(54) Improved fermentative carotenoid production

(57) The present invention is directed to processes 
for the preparation of canthaxanthin, adonixanthin, 
astaxanthin, a mixture of adonixanthin and astaxanthin 
and zeaxanthin by a cell which has been transformed by 
DNA sequences encoding the respective biosynthetic 
enzymes of Flavobacterium and the gram negative bac- 
terium E-396. Furthermore the present invention is 
directed to a food or feed composition comprising one 
or more of the aforementioned carotenoids. 
Over 600 different carotenoids have been described from carotenogenic organisms found among bacteria, yeast,
fungi and plants. Currently only two of them, β-carotene and astaxanthin, are commercially produced in microorganisms
and used in the food and feed industry. β-Carotene is obtained from algae and astaxanthin is produced in Pfaffia strains
which have been generated by classical mutation. However, fermentation in Pfaffia has the disadvantage of long
fermentation cycles, and recovery from algae is cumbersome. It is therefore desirable to develop production systems
which have better industrial applicability, e.g. can be manipulated for increased titers and/or reduced fermentation
times. Two such systems using the biosynthetic genes from Erwinia herbicola and Erwinia uredovora have already been
described in WO 91/13078 and EP 393 690, respectively. Furthermore, three β-carotene ketolase genes (β-carotene
β-4-oxygenase) of the marine bacteria Agrobacterium aurantiacum and Alcaligenes strain PC-1 (crtW) [Misawa, 1995,
Biochem. Biophys. Res. Com. 209, 867-876][Misawa, 1995, J. Bacteriology 177, 6575-6584] and from the green alga
Haematococcus pluvialis (bkt) [Lotan, 1995, FEBS Letters 364, 125-128][Kajiwara, 1995, Plant Mol. Biol. 29, 343-352]
have been cloned. E. coli carrying the carotenogenic genes (crtE, crtB, crtY and crtI) of either E. herbicola [Hundle,
1994, MGG 245, 406-416] or E. uredovora, and complemented with the crtW gene of A. aurantiacum [Misawa, 1995]
or the bkt gene of H. pluvialis [Lotan, 1995][Kajiwara, 1995], accumulated canthaxanthin (β,β-carotene-4,4'-dione),
originating from the conversion of β-carotene via the intermediate echinenone (β,β-carotene-4-one).
Introduction of the above mentioned genes (crtW or bkt) into E. coli cells harbouring, besides the carotenoid biosynthesis
genes mentioned above, also the crtZ gene of E. uredovora [Kajiwara, 1995][Misawa, 1995] resulted in both cases
in the accumulation of astaxanthin (3,3'-dihydroxy-β,β-carotene-4,4'-dione). The results obtained with the bkt gene are
in contrast to the observation made by others [Lotan, 1995] who, using the same experimental set-up but introducing
the H. pluvialis bkt gene into a zeaxanthin (β,β-carotene-3,3'-diol) synthesising E. coli host harbouring the carotenoid
biosynthesis genes of E. herbicola, a close relative of the above mentioned E. uredovora strain, did not observe
astaxanthin production.

Since there is a continuing need for even more optimized fermentation systems for industrial application, it is
in the first instance an object of the present invention to provide a process for the preparation of canthaxanthin by
culturing under suitable culture conditions a cell which is transformed by a DNA sequence comprising the following DNA
sequences:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence
which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA
sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA
sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence
which is substantially homologous;

e) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283)
[crtW E396] or a DNA sequence which is substantially homologous;

or a cell which is transformed by a vector comprising DNA sequences specified above under a) to e), and by isolating
canthaxanthin from such cells or the culture medium by methods known in the art.

Furthermore it is in the second instance an object of the present invention to provide a process for the preparation
of a mixture of adonixanthin and astaxanthin, or of adonixanthin or astaxanthin alone, by a process as mentioned above,
characterized therein that in addition to the DNA sequences specified under a) to e) the following additional DNA
sequence is present:

f) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283)
[crtZ E396] or a DNA sequence which is substantially homologous;

and the DNA sequence specified under e) is as specified above or the following sequence:

g) a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA
sequence which is substantially homologous;

and isolating the desired mixture of adonixanthin and astaxanthin, or adonixanthin or astaxanthin alone, from such
cells or the culture medium and separating the desired mixture or carotenoids alone from other carotenoids which
might be present by methods known in the art.

Furthermore it is an object of the present invention to provide a process for the preparation of zeaxanthin by a process
as claimed in the first instance, characterized therein that the DNA sequence as specified under e) is replaced by
the DNA sequence as specified under f) in the second instance, and by isolating zeaxanthin from the cell or the culture
medium and separating it from other carotenoids which might be present by methods known in the art.

Furthermore it is an object of the present invention to provide a process for the production of adonixanthin by
culturing under suitable culture conditions a cell which is transformed by a DNA sequence comprising the following
heterologous DNA sequences:

a) a DNA sequence which encodes the GGPP synthase of the microorganism E-396 (FERM BP-4283) [crtE E396]
or a DNA sequence which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of the microorganism E-396 (FERM BP-4283)
[crtB E396] or a DNA sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of the microorganism E-396 (FERM BP-4283)
[crtI E396] or a DNA sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of the microorganism E-396 (FERM BP-4283) [crtY E396]
or a DNA sequence which is substantially homologous;

e) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283)
[crtZ E396] or a DNA sequence which is substantially homologous; and

f) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283)
[crtW E396] or a DNA sequence which is substantially homologous;

and isolating adonixanthin from the cell or the culture medium and separating it from other carotenoids which might
be present by methods known in the art.

Further it is an object of the present invention to provide a process as described above characterized therein that
the transformed host cell is a prokaryotic host cell, like E. coli, Bacillus or Flavobacter, and a process as described above
characterized therein that the transformed host cell is a eukaryotic host cell, like yeast or a fungal cell.

Furthermore it is an object of the present invention to provide a DNA sequence comprising one or more DNA
sequences selected from the group consisting of:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence
which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA
sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA
sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence
which is substantially homologous; and

e) a DNA sequence which encodes the β-carotene hydroxylase of Flavobacterium sp. R1534 (crtZ) or a DNA
sequence which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the
form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed
by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus
strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the
present invention. Finally the present invention concerns also a process for the preparation of a desired carotenoid or
a mixture of carotenoids by culturing such a cell under suitable culture conditions and isolating the desired carotenoid
or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating
it by methods known in the art from other carotenoids which might also be present, and a process for the preparation of
a food or feed composition characterized therein that after such a process has been effected the carotenoid or
carotenoid mixture is added to food or feed.

Furthermore, a DNA sequence comprising the following DNA sequences is an object of the present invention:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence
which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA
sequence which is substantially homologous; and

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA
sequence which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the
form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed
by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus
strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the
present invention. Finally the present invention concerns also a process for the preparation of a desired carotenoid or
a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the desired carotenoid or a
mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by
methods known in the art from other carotenoids which might also be present, preferably such a process for the
preparation of lycopene, and a process for the preparation of a food or feed composition characterized therein that after such
a process has been effected the carotenoid, preferably lycopene, or carotenoid mixture, preferably a lycopene
comprising mixture, is added to food or feed.

Furthermore a DNA sequence comprising the following DNA sequences is also an object of the present invention:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence
which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA
sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA
sequence which is substantially homologous; and

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence
which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the
form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed
by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus
strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the
present invention. Finally the present invention concerns also a process for the preparation of a desired carotenoid or
a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the desired carotenoid or a
mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by
methods known in the art from other carotenoids which might also be present, preferably such a process for the
preparation of β-carotene, and a process for the preparation of a food or feed composition characterized therein that after
such a process has been effected the carotenoid, preferably β-carotene, or carotenoid mixture, preferably a β-carotene
comprising mixture, is added to food or feed.

Furthermore a cell which is transformed by the above mentioned DNA sequence comprising subsequences a) to
d) or the vector comprising it, and by a second DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes
strain PC-1 (crtW) or a DNA sequence which is substantially homologous, or by a second vector which comprises a DNA
sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is
substantially homologous, is an object of the present invention; and a process for the preparation of a desired
carotenoid or a mixture of carotenoids by culturing such cell under suitable conditions and isolating the desired
carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired,
separating it by methods known in the art from other carotenoids which might also be present, preferably such a process
for the preparation of echinenone, and a process for the preparation of a food or feed composition characterized therein
that after such a process has been effected the carotenoid, preferably echinenone, or carotenoid mixture, preferably an
echinenone comprising mixture, is added to food or feed.

Furthermore it is an object of the present invention to provide a DNA sequence as mentioned above comprising
subsequences a) to d) and a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1
(crtW) or a DNA sequence which is substantially homologous, and a vector comprising such DNA sequence, preferably
in the form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is
transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or
a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an
object of the present invention. Finally the present invention concerns also a process for the preparation of a desired
carotenoid or a mixture of carotenoids by culturing such a cell under suitable culture conditions and isolating the desired
carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired,
separating it by methods known in the art from other carotenoids which might also be present, especially such a process
for the preparation of echinenone or canthaxanthin, and a process for the preparation of a food or feed composition
characterized therein that after such a process has been effected the carotenoid, preferably echinenone or
canthaxanthin, or carotenoid mixture, preferably an echinenone or canthaxanthin containing mixture, is added to food or feed.

Furthermore a DNA sequence comprising the following DNA sequences is also an object of the present invention:

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA sequence
which is substantially homologous;

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA
sequence which is substantially homologous;

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA
sequence which is substantially homologous;

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA sequence
which is substantially homologous; and

e) a DNA sequence which encodes the β-carotene hydroxylase of Flavobacterium sp. R1534 (crtZ) or a DNA
sequence which is substantially homologous.

It is also an object of the present invention to provide a vector comprising such DNA sequence, preferably in the
form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is transformed
by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli or a Bacillus
strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an object of the
present invention. Finally the present invention concerns also a process for the preparation of a desired carotenoid or
a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the desired carotenoid or a
mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired, separating it by
methods known in the art from other carotenoids which might also be present, preferably such a process for the
preparation of zeaxanthin, and a process for the preparation of a food or feed composition characterized therein that after
such a process has been effected the carotenoid, preferably zeaxanthin, or the carotenoid mixture, preferably a
zeaxanthin containing mixture, is added to food or feed.

Furthermore a DNA sequence as mentioned above comprising subsequences a) to e) and in addition a DNA
sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is
substantially homologous is an object of the present invention, as is a vector comprising such DNA sequence,
preferably in form of an expression vector. Furthermore it is an object of the present invention to provide a cell which is
transformed by such DNA sequence or vector, preferably which is a prokaryotic cell and more preferably which is E. coli
or a Bacillus strain. Such transformed cell which is a eukaryotic cell, preferably a yeast cell or a fungal cell, is also an
object of the present invention. Finally the present invention concerns also a process for the preparation of a desired
carotenoid or a mixture of carotenoids by culturing such a cell under suitable culture conditions and isolating the desired
carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid is desired,
separating it by methods known in the art from other carotenoids which might also be present, preferably such a process
for the preparation of zeaxanthin, adonixanthin or astaxanthin, and a process for the preparation of a food or feed
composition characterized therein that after such a process has been effected the carotenoid, preferably zeaxanthin,
adonixanthin or astaxanthin, or carotenoid mixture, preferably a zeaxanthin, adonixanthin or astaxanthin containing mixture,
is added to food or feed.

Furthermore a cell which is transformed by the DNA sequence mentioned above comprising subsequences a) to
e) or a vector comprising such DNA sequence, and by a second DNA sequence which encodes the β-carotene β4-oxygenase
of Alcaligenes strain PC-1 (crtW) or a DNA sequence which is substantially homologous, or by a second vector which
comprises a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA
sequence which is substantially homologous, is also an object of the present invention, and a process for the preparation
of a desired carotenoid or a mixture of carotenoids by culturing such a cell under suitable conditions and isolating the
desired carotenoid or a mixture of carotenoids from such cells or the culture medium and, in case only one carotenoid
is desired, separating it by methods known in the art from other carotenoids which might also be present, preferably
such a process for the preparation of zeaxanthin or adonixanthin, and a process for the preparation of a food or feed
composition characterized therein that after such a process has been effected the carotenoid, preferably zeaxanthin or
adonixanthin, or carotenoid mixture, preferably a zeaxanthin or adonixanthin containing mixture, is added to food or feed.

Furthermore it is an object of the present invention to provide the DNA sequences and vectors as specified before
and a process for the preparation of a food or feed composition characterized therein that after a process as specified
before has been effected the carotenoid prepared by such process is added to food or feed.
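By way of illustration only, the gene combinations set out above and the carotenoids named as preferred products for each of them can be summarized in a small lookup table. The following is a minimal sketch: the names `PRODUCTS_BY_GENE_SET` and `preferred_products` are hypothetical, the table reflects only the preferences stated in the text, and it does not distinguish whether crtW is supplied on the same DNA sequence or on a second vector.

```python
# Illustrative lookup only: gene sets and preferred products as named in the text.
# All identifiers here are hypothetical and not part of the described invention.
PRODUCTS_BY_GENE_SET = {
    frozenset({"crtE", "crtB", "crtI"}): ["lycopene"],
    frozenset({"crtE", "crtB", "crtI", "crtY"}): ["beta-carotene"],
    frozenset({"crtE", "crtB", "crtI", "crtY", "crtW"}): [
        "echinenone", "canthaxanthin"],
    frozenset({"crtE", "crtB", "crtI", "crtY", "crtZ"}): ["zeaxanthin"],
    frozenset({"crtE", "crtB", "crtI", "crtY", "crtZ", "crtW"}): [
        "zeaxanthin", "adonixanthin", "astaxanthin"],
}


def preferred_products(genes):
    """Return the carotenoids named in the text for this gene set.

    An empty list means the combination is not listed in the text;
    it does not mean that no carotenoid would be formed.
    """
    return PRODUCTS_BY_GENE_SET.get(frozenset(genes), [])


if __name__ == "__main__":
    print(preferred_products(["crtE", "crtB", "crtI", "crtY"]))
```

The order in which the genes are passed does not matter, since the lookup key is an unordered set.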

In this context it should be mentioned that the expression "a DNA sequence is substantially homologous" refers,
with respect to the crtE encoding DNA sequence, to a DNA sequence which encodes an amino acid sequence which
shows more than 45 %, preferably more than 60 %, more preferably more than 75 % and most preferably more than
90 % identical amino acids when compared to the amino acid sequence of crtE of Flavobacterium sp. R1534 and is the
amino acid sequence of a polypeptide which shows the same type of enzymatic activity as the enzyme encoded by crtE
of Flavobacterium sp. R1534. In analogy with respect to crtB this means more than 60 %, preferably more than 70 %,
more preferably more than 80 % and most preferably more than 90 %; with respect to crtI this means more than 70 %,
preferably more than 80 % and most preferably more than 90 %; with respect to crtY this means 55 %, preferably 70 %,
more preferably 80 % and most preferably 90 %.
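The identity thresholds given above for the Flavobacterium sp. R1534 genes can be expressed, purely as an illustrative sketch, as a tiered lookup. The names `HOMOLOGY_TIERS` and `homology_tier` are hypothetical, the percent identity is assumed to have been determined beforehand by an alignment of the encoded amino acid sequences, and the crtY values are treated as "more than" thresholds like the others, which is an assumption since the text states them without that qualifier.

```python
# Percent-identity thresholds quoted in the paragraph above (Flavobacterium genes only).
# Each entry is (threshold in percent, label); labels are hypothetical shorthand.
HOMOLOGY_TIERS = {
    "crtE": [(45, "substantially homologous"), (60, "preferred"),
             (75, "more preferred"), (90, "most preferred")],
    "crtB": [(60, "substantially homologous"), (70, "preferred"),
             (80, "more preferred"), (90, "most preferred")],
    "crtI": [(70, "substantially homologous"), (80, "preferred"),
             (90, "most preferred")],
    # Assumption: crtY values are read as "more than", as for the other genes.
    "crtY": [(55, "substantially homologous"), (70, "preferred"),
             (80, "more preferred"), (90, "most preferred")],
}


def homology_tier(gene, identity_pct, same_activity=True):
    """Return the highest tier met by identity_pct for the gene, or None.

    The definition in the text also requires the same type of enzymatic
    activity, so a sequence lacking it never qualifies.
    """
    if not same_activity:
        return None
    best = None
    for threshold, label in HOMOLOGY_TIERS[gene]:
        if identity_pct > threshold:
            best = label
    return best


if __name__ == "__main__":
    print(homology_tier("crtB", 75.0))
```

A sequence exactly at a threshold (e.g. 45 % for crtE) is not counted, because the text requires more than the stated value.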

"DNA sequences which are substantially homologous" refer with respect to the crtW E396 encoding DNA sequence 

to a DNA sequence which encodes an amino acid sequence which shows more than 60%, preferably more than 75%
and most preferably more than 90% identical amino acids when compared to the amino acid sequence of crtW of the
microorganism E-396 (FERM BP-4283) and is the amino acid sequence of a polypeptide which shows the same type
of enzymatic activity as the enzyme encoded by crtW of the microorganism E-396. In analogy with respect to crtZ E396
this means more than 75%, preferably more than 80% and most preferably more than 90%; with respect to crtE E396,
crtB E396, crtI E396, crtY E396 and crtZ E396 this means more than 80%, preferably more than 90% and most preferably
95%.

DNA sequences in form of genomic DNA, cDNA or synthetic DNA can be prepared as known in the art [see e.g.
Sambrook et al., Molecular Cloning, Cold Spring Harbor Laboratory Press 1989] or, e.g., as specifically described in
Examples 1, 2 or 7. In the context of the present invention it should be noted that all DNA sequences used for the process
for production of carotenoids of the present invention encoding crt-gene products can also be prepared as synthetic
DNA sequences according to known methods or in analogy to the method specifically described for crtW in Example 7.

The cloning of the DNA sequences of the present invention from such genomic DNA can then be effected, e.g. by
using the well known polymerase chain reaction (PCR) method. The principles of this method are outlined e.g. in PCR
Protocols: A Guide to Methods and Applications, Academic Press, Inc. (1990). PCR is an in vitro method for producing
large amounts of a specific DNA of defined length and sequence from a mixture of different DNA sequences.
PCR is based on the enzymatic amplification of the specific DNA fragment of interest, which is flanked by two
oligonucleotide primers which are specific for this sequence and which hybridize to opposite strands of the target sequence.
The primers are oriented with their 3' ends pointing toward each other. Repeated cycles of heat denaturation of the
template, annealing of the primers to their complementary sequences and extension of the annealed primers with a DNA
polymerase result in the amplification of the segment between the PCR primers. Since the extension product of each
primer can serve as a template for the other, each cycle essentially doubles the amount of the DNA fragment produced
in the previous cycle. By utilizing the thermostable Taq DNA polymerase, isolated from the thermophilic bacterium
Thermus aquaticus, it has been possible to avoid denaturation of the polymerase, which necessitated the addition of enzyme
after each heat denaturation step. This development has led to the automation of PCR by a variety of simple
temperature-cycling devices. In addition, the specificity of the amplification reaction is increased by allowing the use of higher
temperatures for primer annealing and extension. The increased specificity improves the overall yield of amplified
products by minimizing the competition by non-target fragments for enzyme and primers. In this way the specific sequence
of interest is highly amplified and can be easily separated from the non-specific sequences by methods known in the
art, e.g. by separation on an agarose gel, and cloned by methods known in the art using vectors as described e.g. by
[...] et al. in Nucleic Acid Res. [...], 1154 (1991) or Mead et al. in Bio/Technology 9, 657-663 (1991).
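The statement that each cycle essentially doubles the amount of the target fragment can be illustrated with a short calculation. The sketch below is an idealized model only: the function name `pcr_copies` and the `efficiency` parameter (the fraction of templates copied per cycle, 1.0 meaning perfect doubling) are assumptions introduced for illustration, and the model ignores the plateau observed in real reactions.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency is a hypothetical per-cycle efficiency between 0 and 1;
    with efficiency 1.0 the amount doubles in every cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Perfect doubling: one template molecule after 1 to 5 cycles.
    for n in range(1, 6):
        print(n, pcr_copies(1, n))
```

With perfect doubling, a single template gives 2 to the power n copies after n cycles, so that thirty cycles correspond to roughly 10^9 copies in this idealized model.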

The oligonucleotide primers used in the PCR procedure can be prepared as known in the art and described e.g. in
Sambrook et al. [...] DNA sequences can then be used to screen DNA libraries by methods known in the art (Sambrook et al., [...]
define new PCR primers for the cloning of substantially homologous DNA sequences from other sources. In addition
[...] and such homologous DNA sequences can be integrated into vectors by methods known [...]
in Sambrook et al. (s.a.) to express or overexpress the encoded polypeptide(s) in appropriate host systems. [...]
of the invention to get overexpression of the encoded polypeptide. Appropriate [...]
bacteria, e.g. E. coli, Bacilli as, e.g., Bacillus subtilis, or Flavobacter strains. E. coli which could be used are E. coli K12
strains, e.g. M15 [described as DZ 291 by Villarejo et al. in J. Bacteriol. 120, 466-474 (1974)], HB 101 [ATCC No. 33694]
or E. coli SG13009 [Gottesman et al., J. Bacteriol. 148, 265-273 (1981)]. Suitable Flavobacter strains can be obtained
from [...]
January 1994, pgs 29-40), like the American Type Culture Collection (ATCC) or the Centralbureau voor Schimmelcultures
(CBS), and are, e.g., Flavobacterium sp. R1534 (ATCC No. 21588, classified as unknown bacterium, or as CBS
[...]) and [...] strains listed as CBS 517.67 to CBS 521.67 and CBS [...]
1533 (which is CBS 523.67 or ATCC 21081, classified as unknown bacterium; see also USP 3,[...]41,967). Further
Flavobacter strains are also described in WO 91/03571. Suitable eukaryotic host systems are for example fungi, like
[...], e.g. Aspergillus niger [ATCC 9142], or yeasts, like Saccharomyces, e.g. Saccharomyces cerevisiae, or Pichia [...]

[...] et al. in "Procd. 8th Int. Biotechnology Symposium" [Soc. Franc. de Microbiol., Paris (Durand et al., eds.), pp. 680-697
(1988)] or by Bujard et al. in Methods in Enzymology, eds. Wu and Grossmann, Academic Press, Inc., Vol. [...], 416-433
(1987) and Stüber et al. in Immunological Methods, eds. Lefkovits and Pernis, Academic Press, Inc., Vol. IV, 121-152
[...]. Vectors which could be used for expression in Bacilli are known in the art and [...]
[...] Procd. Nat. Acad. Sci. USA 81, 439 (1984), by Yansura and Henner, Meth. Enzym. 185, 199-228 (1990), or
[...] EP 183 071, EP 248 227, EP 263 311. Vectors which can be used for expression in [...]

[...] have been expressed in an appropriate host cell in a suitable medium, the carotenoids
can be [...] the medium in the case they are secreted into the medium or from the [...]
if necessary, separated from other carotenoids if present, in case one specific carotenoid is desired, by [...]
in the art (see e.g. Carotenoids Vol. 1A: Isolation and Analysis, G. Britton, S. Liaaen-Jensen, H. Pfander, 1995, [...]

[...] of the present invention can be used in a process for the preparation of food or feeds. A man [...]
components generally used for such purpose and known in the state of the art.

After the invention has been described in general hereinbefore, the following figures and examples are intended to
illustrate details of the invention, without thereby limiting it in any manner.

Figure 1 : The biosynthesis pathway for the formation of carotenoids of Flavobacterium sp. R1534 is illustrated,
explaining the enzymatic activities which are encoded by DNA sequences of the present invention.



Figure 2 : Southern blot of genomic Flavobacterium sp. R1534 DNA digested with the restriction enzymes shown
on top of each lane and hybridized with Probe 46F. The arrow indicates the isolated 2.4 kb XhoI/PstI fragment.

Figure 3 : Southern blot of genomic Flavobacterium sp. R1534 DNA digested with ClaI or double digested with ClaI
and HindIII. Blots shown in Panel A and B were hybridized to probe A or probe B, respectively (see examples).
Both ClaI/HindIII fragments of 1.8 kb and 9.2 kb are indicated.

Figure 4 : Southern blot of genomic Flavobacterium sp. R1534 DNA digested with the restriction enzymes shown
on top of each lane and hybridized to probe C. The isolated 2.8 kb SalI/HindIII fragment is shown by the
arrow.

Figure 5 : Southern blot of genomic Flavobacterium sp. R1534 DNA digested with the restriction enzymes shown
on top of each lane and hybridized to probe D. The isolated BclI/SphI fragment of approx. 3 kb is shown
by the arrow.

Figure 6 : Physical map of the organization of the carotenoid biosynthesis cluster in Flavobacterium sp. R1534,
deduced from the genomic clones obtained. The locations of the probes used for the screening are shown
as bars on the respective clones.

Figure 7 : Nucleotide sequence of the Flavobacterium sp. R1534 carotenoid biosynthesis cluster and its flanking
regions. The nucleotide sequence is numbered from the first nucleotide shown (see BamHI site of Fig.
6). The deduced amino acid sequences of the ORFs (orf-5, orf-1, crtE, crtB, crtI, crtY, crtZ and orf-16) are
shown with the single-letter amino acid code. Arrows (-->) indicate the direction of the transcription;
asterisks, stop codons.

Figure 8 : Protein sequence of the GGPP synthase (crtE) of Flavobacterium sp. R1534 with a MW of 31331 Da.

Figure 9 : Protein sequence of the prephytoene synthase (crtB) of Flavobacterium sp. R1534 with a MW of
32615 Da.

Figure 10 : Protein sequence of the phytoene desaturase (crtI) of Flavobacterium sp. R1534 with a MW of 54411
Da.

Figure 11 : Protein sequence of the lycopene cyclase (crtY) of Flavobacterium sp. R1534 with a MW of 42368 Da.

Figure 12 : Protein sequence of the β-carotene hydroxylase (crtZ) of Flavobacterium sp. R1534 with a MW of 19282
Da.

Figure 13 : Recombinant plasmids containing deletions of the Flavobacterium sp. R1534 carotenoid biosynthesis
gene cluster.

Figure 14 : Primers used for PCR reactions. The underlined sequence is the recognition site of the indicated
restriction enzyme. Small caps indicate nucleotides introduced by mutagenesis. Boxes show the artificial RBS
which is recognized in B. subtilis. Small caps in bold show the location of the original adenine creating
the translation start site (ATG) of the following gene (see original operon). All the ATGs of the original
Flavobacter carotenoid biosynthetic genes had to be destroyed so as not to interfere with the rebuilt
transcription start site. Arrows indicate start and ends of the indicated Flavobacterium R1534 WT carotenoid
genes.

Figure 15 : Linkers used for the different constructions. The underlined sequence is the recognition site of the
indicated restriction enzyme. Small caps indicate nucleotides introduced by synthetic primers. Boxes show
the artificial RBS which is recognized in B. subtilis. Arrows indicate start and ends of the indicated
Flavobacterium carotenoid genes.

Figure 16 : Construction of plasmids pBIIKS(+)-clone59-2, pLyco and pZea4.

Figure 17 : Construction of plasmid p602CAR.

Figure 18 : Construction of plasmids pBIIKS(+)-CARVEG-E and p602 CARVEG-E.

Figure 19 : Construction of plasmids pHP13-2CARZYIB-EINV and pHP13-2PN25ZYIB-EINV.

Figure 20 : Construction of plasmid pXI12-ZYIB-EINVMUTRBS2C.

Figure 21 : Northern blot analysis of B. subtilis strain BS1012::ZYIB-EINV4. Panel A: Schematic representation of a
reciprocal integration of plasmid pXI12-ZYIB-EINV4 into the levan-sucrase gene of B. subtilis. Panel B:
Northern blot obtained with probe A (PCR fragment which was obtained with CAR 51 and CAR 76 and
hybridizes to the 3' end of crtZ and the 5' end of crtY). Panel C: Northern blot obtained with probe B
(BamHI-XhoI fragment isolated from plasmid pBIIKS(+)-crtE/2 and hybridizing to the 5' part of the crtE
gene).

Figure 22 : Schematic representation of the integration sites of three transformed Bacillus subtilis strains:
BS1012::SFCO, BS1012::SFCOCAT1 and BS1012::SFCONEO1. Amplification of the synthetic Flavobacterium
carotenoid operon (SFCO) can only be obtained in those strains having amplifiable structures.
Probe A was used to determine the copy number of the integrated SFCO. Erythromycin resistance
gene (ermAM), chloramphenicol resistance gene (cat), neomycin resistance gene (neo), terminator of
the cryT gene of B. subtilis (cryT), levan-sucrase gene (sac-B 5' and sac-B 3'), plasmid sequences of
pXI12 (pXI12), promoter originating from site I of the veg promoter complex (PvegI).

Figure 23 : Construction of plasmids pXI12-ZYIB-EINV4MUTRBS2CNEO and pXI12-ZYIB-EINV4MUTRBS2CCAT.

Figure 24 : Complete nucleotide sequence of plasmid pZea4.

Figure 25 : Synthetic crtW gene of Alcaligenes PC-1. The translated protein sequence is shown above the double
stranded DNA sequence. The twelve oligonucleotides (crtW1-crtW12) used for the PCR synthesis are
underlined.

Figure 26 : Construction of plasmid pBIIKS-crtEBIYZW. The HindIII-PmlI fragment of pALTER-Ex2-crtW, carrying
the synthetic crtW gene, was cloned into the HindIII and MluI (blunt) sites. PvegI and Ptac are the
promoters used for the transcription of the two operons. The ColE1 replication origin of this plasmid is
compatible with the p15A origin present in the pALTER-Ex2 constructs.

Figure 27 : Relevant inserts of all plasmids constructed in Example 7. Disrupted genes are shown by //. Restriction
sites: S=SacI, X=XbaI, H=HindIII, N=NsiI, Hp=HpaI, Nd=NdeI.

Figure 28 : Reaction products (carotenoids) obtained from β-carotene by the process of the present invention.

Example 1

Materials and general methods used 

Bacterial strains and plasmids: Flavobacterium sp. R1534 WT (ATCC 21588) was the DNA source for the genes
cloned. Partial genomic libraries of Flavobacterium sp. R1534 WT DNA were constructed in the pBluescriptII+(KS) or
(SK) vector (Stratagene, La Jolla, USA) and transformed into E. coli XL-1 blue (Stratagene) or JM109.

Media and growth conditions: Transformed E. coli were grown in Luria broth (LB) at 37°C with 100 µg Ampicillin
(Amp)/ml for selection. Flavobacterium sp. R1534 WT was grown at 27°C in medium containing 1% glucose, 1%
tryptone (Difco Laboratories), 1% yeast extract (Difco), 0.5% MgSO4·7H2O and 3% NaCl.

Colony screening: Screening of the E. coli transformants was done by PCR basically according to the method 
described by Zon et al. [Zon et al., BioTechniques 7, 696-698 (1989)] using the following primers: 

Primer #7: 5'-CCTGGATGACGTGCTGGAATATTCC-3' 
Primer #8: 5'-CAAGGCCCAGATCGCAGGCG-3' 
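
As an illustration only, the two screening primers listed above can be checked for length and GC content before use. The following is a minimal sketch in Python; the function names and the dictionary are assumptions introduced here and are not part of the described method.

```python
# Screening primers exactly as printed in the text (5' -> 3').
PRIMERS = {
    "#7": "CCTGGATGACGTGCTGGAATATTCC",
    "#8": "CAAGGCCCAGATCGCAGGCG",
}

# Watson-Crick partner of each base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C residues in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the strand the primer anneals to."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", f"GC = {gc_fraction(seq):.0%}")
```

The reverse complement is included because, for a primer pair, one oligonucleotide matches the numbered strand and the other matches its complement; which of the two primers is the reverse primer is not stated in the text.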

Genomic DNA: A 50 ml overnight culture of Flavobacterium sp. R1534 was centrifuged at 10,000 g for 10 minutes.
The pellet was washed briefly with 10 ml of lysis buffer (50 mM EDTA, 0.1 M NaCl, pH 7.5), resuspended in 4 ml of the
same buffer supplemented with 10 mg of lysozyme and incubated at 37°C for 15 minutes. After addition of 0.3 ml of
N-lauroyl sarcosine (20%) the incubation at 37°C was continued for another 15 minutes before the extraction of the DNA
with phenol, phenol/chloroform and chloroform. The DNA was ethanol precipitated at room temperature for 20 minutes
in the presence of 0.3 M sodium acetate (pH 5.2), followed by centrifugation at 10,000 g for 15 minutes. The pellet was
rinsed with 70% ethanol, dried and resuspended in 1 ml of TE (10 mM Tris, 1 mM EDTA, pH 8.0).

All genomic DNA used in the Southern blot analysis and cloning experiments was dialysed against H2O for 48
hours, using collodion bags (Sartorius, Germany), ethanol precipitated in the presence of 0.3 M sodium acetate and
resuspended in H2O.

Probe labelling: DNA probes were labeled with (α-32P) dGTP (Amersham) by random-priming according to
[Sambrook et al., s.a.].

Probes used to screen the mini-libraries: Probe 46F is a 119 bp fragment obtained by PCR using primer #7 and
#8 and Flavobacterium sp. R1534 genomic DNA as template. This probe was proposed to be a fragment of the
Flavobacterium sp. R1534 phytoene synthase (crtB) gene, since it shows significant homology to the phytoene synthase
genes from other species (e.g. E. uredovora, E. herbicola). Probe A is a BstXI - PstI fragment of 184 bp originating from
the right arm of the insert of clone 85. Probe B is a 397 bp XhoI - Natl fragment obtained from the left end of the insert
of clone 85. Probe C is a 536 bp BglII - PstI fragment from the right end of the insert of clone 85. Probe D is a 376 bp
KpnI - BstYI fragment isolated from the insert of clone 59. The localization of the individual probes is shown in figure 6.

Oligonucleotide synthesis: The oligonucleotides used for PCR reactions or for sequencing were synthesized with 
an Applied Biosystems 392 DNA synthesizer. 

Southern blot analysis: For hybridization experiments Flavobacterium sp. R1534 genomic DNA (3 µg) was
digested with the appropriate restriction enzymes and electrophoresed on a 0.75% agarose gel. The transfer to
Zeta-Probe blotting membranes (BIO-RAD) was done as described [Southern, E.M., J. Mol. Biol. 98, 503 (1975)].
Prehybridization and hybridization were in 7% SDS, 1% BSA (fraction V; Boehringer), 0.5 M Na2HPO4, pH 7.2 at 65°C. After
hybridization the membranes were washed twice for 5 minutes in 2x SSC, 1% SDS at room temperature and twice for
15 minutes in 0.1x SSC, 0.1% SDS at 65°C.

DNA sequence analysis: The sequence was determined by the dideoxy chain termination technique [Sanger et
al., Proc. Natl. Acad. Sci. USA 74, 5463-5467 (1977)] using the Sequenase Kit (United States Biochemical). Both
strands were completely sequenced and the sequence analyzed using the GCG sequence analysis software package
(Version 8.0) by Genetics Computer, Inc. [Devereux et al., Nucleic Acids Res. 12, 387-395 (1984)].

Analysis of carotenoids: E. coli XL-1 or JM109 cells (200 - 400 ml) carrying different plasmid constructs were
grown for the times indicated in the text, usually 24 to 60 hours, in LB supplemented with 100 µg Ampicillin/ml, in shake
flasks at 37°C and 220 rpm.

The carotenoids present in the microorganisms were extracted with an adequate volume of acetone using a
rotation homogenizer (Polytron, Kinematica AG, CH-Luzern). The homogenate was then filtered through the sintered glass
of a suction filter into a round bottom flask. The filtrate was evaporated by means of a rotation evaporator at 50°C using
a water-jet vacuum. For the zeaxanthin detection the residue was dissolved in n-hexane/acetone (86:14) before
analysis with a normal-phase HPLC as described in [Weber, S. in Analytical Methods for Vitamins and Carotenoids in Feed,
Keller, H.E., Editor, 83-85 (1988)]. For the detection of β-carotene and lycopene the evaporated extract was dissolved
in n-hexane/acetone (99:1) and analysed by HPLC as described in [Hengartner et al., Helv. Chim. Acta 75, 1848-1865
(1992)].

Example 2 

Cloning of the Flavobacterium sp. R1534 carotenoid biosynthetic genes.

To identify and isolate DNA fragments carrying the genes of the carotenoid biosynthesis pathway, we used the DNA
fragment 46F (see methods) to probe a Southern blot carrying chromosomal DNA of Flavobacterium sp. R1534
digested with different restriction enzymes (Fig. 2). The 2.4 kb XhoI/PstI fragment hybridizing to the probe seemed the
most appropriate one to start with. Genomic Flavobacterium sp. R1534 DNA was digested with XhoI/PstI and run on a
1% agarose gel. According to a comigrating DNA marker, the region of about 2.4 kb was cut out of the gel and the DNA
isolated. A XhoI/PstI mini library of Flavobacterium sp. R1534 genomic DNA was constructed into the XhoI - PstI sites of
pBluescriptIISK(+). One hundred E. coli XL1 transformants were subsequently screened by PCR with primer #7 and
primer #8, the same primers previously used to obtain the 119 bp fragment (46F). One positive transformant, named
clone 85, was found. Sequencing of the insert revealed sequences not only homologous to the phytoene synthase
(crtB) but also to the phytoene desaturase (crtI) of both Erwinia species herbicola and uredovora. Left and right hand
genomic sequences of clone 85 were obtained by the same approach using probe A and probe B. Flavobacterium sp.
R1534 genomic DNA was double digested with ClaI and HindIII and subjected to Southern analysis with probe A and
probe B. With probe A a ClaI/HindIII fragment of approx. 1.8 kb was identified (Fig. 3A), isolated and subcloned into the
ClaI/HindIII sites of pBluescriptIIKS(+). Screening of the E. coli XL1 transformants with probe A gave 6 positive clones.
The insert of one of these positives, clone 43-3, was sequenced and showed homology to the N-terminus of crtI genes
and to the C-terminus of crtY genes of both Erwinia species mentioned above. With probe B an approx. 9.2 kb
ClaI/HindIII fragment was detected (Fig. 3B), isolated and subcloned into pBluescriptIIKS(+).

A screening of the transformants gave one positive, clone 51. Sequencing of the 5' and 3' ends of the insert revealed
that only the region close to the HindIII site showed relevant homology to genes of the carotenoid biosynthesis of the
Erwinia species mentioned above (e.g. crtB gene and crtE gene). The sequence around the ClaI site showed no
homology to known genes of the carotenoid biosynthesis pathway. Based on this information and to facilitate further
sequencing and construction work, the 4.2 kb BamHI/HindIII fragment of clone 51 was subcloned into the respective sites of
pBluescriptIIKS(+) resulting in clone 2. Sequencing of the insert of this clone confirmed the presence of genes
homologous to Erwinia sp. crtB and crtE genes. These genes were located within 1.8 kb from the HindIII site. The remaining
2.4 kb of this insert had no homology to known carotenoid biosynthesis genes.

Additional genomic sequences downstream of the ClaI site were detected using probe C to hybridize to
Flavobacterium sp. R1534 genomic DNA digested with different restriction enzymes (see figure 4).

A SalI/HindIII fragment of 2.8 kb identified by Southern analysis was isolated and subcloned into the HindIII/XhoI
sites of pBluescriptIIKS(+). Screening of the E. coli XL1 transformants with probe A gave one positive clone named
clone 59. The insert of this clone confirmed the sequence of clone 43-3 and contained in addition sequences
homologous to the N-terminus of the crtY gene from other known lycopene cyclases. To obtain the putative missing crtZ gene
a Sau3AI partial digestion library of Flavobacterium sp. R1534 was constructed into the BamHI site of
pBluescriptIIKS(+). Screening of this library with probe D gave several positive clones. One transformant, designated clone 6a, had
an insert of 4.9 kb. Sequencing of the insert revealed besides the already known sequences coding for crtB, crtI and
crtY also the missing crtZ gene. Clone 7g was isolated from a mini library carrying BclI/SphI fragments of R1534 (Fig.
5) and screened with probe D. The insert size of clone 7g is approx. 3 kb.

The six independent inserts of the clones described above covering approx. 14 kb of the Flavobacterium sp. R1534
genome are compiled in Figure 6.

The determined sequence spanning from the BamHI site (position 1) to base pair 8625 is shown in figure 7.

Putative protein coding regions of the cloned R1534 sequence. 

Computer analysis using the CodonPreference program of the GCG package, which recognizes protein coding
regions by virtue of the similarity of their codon usage to a given codon frequency table, revealed eight open reading
frames (ORFs) encoding putative proteins: a partial ORF from 1 to 1165 (ORF-5) coding for a polypeptide larger than
41382 Da; an ORF coding for a polypeptide with a molecular weight of 40081 Da from 1180 to 2352 (ORF-1); an ORF
coding for a polypeptide with a molecular weight of 31331 Da from 2521 to 3405 (crtE); an ORF coding for a polypeptide
with a molecular weight of 32615 Da from 4316 to 3408 (crtB); an ORF coding for a polypeptide with a molecular weight
of 54411 Da from 5797 to 4316 (crtI); an ORF coding for a polypeptide with a molecular weight of 42368 Da from 6942
to 5797 (crtY); an ORF coding for a polypeptide with a molecular weight of 19282 Da from 7448 to 6942 (crtZ); and an
ORF coding for a polypeptide with a molecular weight of 19368 Da from 8315 to 7770 (ORF-16). ORF-1 and crtE have
the opposite transcriptional orientation from the others (Fig. 6). The translation start sites of the ORFs crtI, crtY and crtZ
could clearly be determined based on the appropriately located sequences homologous to the Shine/Dalgarno (S/D)
[Shine and Dalgarno, Proc. Natl. Acad. Sci. USA 71, 1342-1346 (1974)] consensus sequence AGG-6-9N--ATG (Fig.
10) and the homology to the N-terminal sequences of the respective enzymes of E. herbicola and E. uredovora. The
translation of the ORF crtB could potentially start from three closely spaced codons ATG (4316), ATG (4241) and ATG
(4211). The first one, although not having the best S/D sequence of the three, gives a translation product with the
highest homology to the N-terminus of the E. herbicola and E. uredovora crtB protein, and is therefore the most likely
translation start site. The translation of ORF crtE could potentially start from five different start codons found within 150 bp:
ATG (2389), ATG (2446), ATG (2473), ATG (2497) and ATG (2521). We believe that, based on the following
observations, the ATG (2521) is the most likely translation start site of crtE: this ATG start codon is preceded by the best
consensus S/D sequence of all five putative start sites mentioned; and the putative N-terminal amino acid sequence of the
protein encoded has the highest homology to the N-terminus of the crtE enzymes of E. herbicola and E. uredovora.
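
The ORF coordinates listed above can be checked arithmetically. The following minimal sketch in Python takes the first and last positions as given in the text, treats them as 1-based and inclusive (an assumption; the text does not say whether the stop codon is included), and derives the length in base pairs and the number of codons. The names used are introduced here for illustration only.

```python
# (first position, last position) as listed in the text; a first position
# larger than the last one means the ORF runs against the numbering of Fig. 7.
# The partial ORF-5 (1 to 1165) is left out because it is incomplete.
ORFS = {
    "ORF-1": (1180, 2352),
    "crtE": (2521, 3405),
    "crtB": (4316, 3408),
    "crtI": (5797, 4316),
    "crtY": (6942, 5797),
    "crtZ": (7448, 6942),
    "ORF-16": (8315, 7770),
}


def orf_length(first: int, last: int) -> int:
    """Length in bp, assuming 1-based inclusive coordinates."""
    return abs(last - first) + 1


def codon_count(first: int, last: int) -> int:
    """Number of codons; raises if the span is not a multiple of three."""
    length = orf_length(first, last)
    if length % 3 != 0:
        raise ValueError(f"span of {length} bp is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    for name, (first, last) in ORFS.items():
        strand = "forward" if first < last else "reverse"
        print(f"{name}: {orf_length(first, last)} bp, "
              f"{codon_count(first, last)} codons, {strand}")
```

Under this assumption the codon counts for crtE, crtB, crtI, crtY and crtZ come out as 295, 303, 494, 382 and 169, which agrees with the amino acid counts stated for the respective gene products.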

Characteristics of the crt translational initiation sites and gene products. 

The translational start sites of the five carotenoid biosynthesis genes are shown below and the possible ribosome
binding sites are underlined. The genes crtZ, crtY, crtI and crtB are grouped so tightly that the TGA stop codon of the
anterior gene overlaps the ATG of the following gene. Only three of the five genes (crtI, crtY and crtZ) fit with the
consensus for optimal S/D sequences. The boxed TGA sequence shows the stop codon of the anterior gene.

crtE   ACG AAGGC ACCGATG ACGCCCA
crtB   CGGACCTGGC CGT CGC AgSgCCGATC
crtY   CGGATCGCAA TAC ?fTGplGC C ATG
crtI   CTGCAGGAGAGAGCA^^GTTCCG
crtZ   GCAAGGGGCCGGCATGAGCAC TT
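
The S/D consensus AGG-6-9N--ATG quoted above can be expressed as a regular expression. The sketch below, in Python, is an illustration under stated assumptions: it reads the consensus as "AGG, then six to nine nucleotides, then ATG", and it is applied only to the crtZ and crtE start regions, taken as printed in this document, whose printed form may be incomplete. The function and variable names are hypothetical.

```python
import re

# AGG, followed by 6 to 9 nucleotides of any kind, followed by the ATG start codon.
SD_CONSENSUS = re.compile(r"AGG[ACGT]{6,9}ATG")


def find_sd_start(region: str):
    """Return (offset of AGG, spacer length) for the first consensus match, or None."""
    seq = region.replace(" ", "").upper()
    match = SD_CONSENSUS.search(seq)
    if match is None:
        return None
    spacer = len(match.group(0)) - len("AGG") - len("ATG")
    return match.start(), spacer


# Start regions as printed in this document (spaces are ignored).
CRTZ_REGION = "GCAAGGGGCCGGCATGAGCAC TT"
CRTE_REGION = "ACG AAGGC ACCGATG ACGCCCA"

if __name__ == "__main__":
    print("crtZ:", find_sd_start(CRTZ_REGION))
    print("crtE:", find_sd_start(CRTE_REGION))
```

For the crtZ region the pattern matches with a spacer of seven nucleotides, whereas the crtE region, in which AGG and ATG are only five nucleotides apart, gives no match. This is in line with the statement that crtZ fits the consensus and crtE does not.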



Amino acid sequences of individual crt genes of Flavobacterium sp. R1534. 

All five ORFs of Flavobacterium sp. R1534 having homology to known carotenoid biosynthesis genes of other
species are clustered in approx. 5.2 kb of the sequence (Fig. 7).

GGDP synthase (crtE) 

The amino acid (aa) sequence of the geranylgeranyl pyrophosphate synthase (crtE gene product) consists of 295
aa and is shown in figure 8. This enzyme condenses farnesyl pyrophosphate and isopentenyl pyrophosphate in a 1'-4
reaction.

Phytoene synthase (crtB) 

This enzyme catalyzes two enzymatic steps. First it condenses in a head to head reaction two geranylgeranyl
pyrophosphates (C20) to the C40 carotenoid prephytoene. Second it rearranges the cyclopropyl ring of prephytoene to phytoene.
The 303 aa sequence encoded by the crtB gene of Flavobacterium sp. R1534 is shown in figure 9.

Phytoene desaturase (crtI)

The phytoene desaturase of Flavobacterium sp. R1534, consisting of 494 aa and shown in figure 10, performs, like the
crtI enzyme of E. herbicola and E. uredovora, four desaturation steps, converting the non-coloured carotenoid
phytoene to the red coloured lycopene.

Lycopene cyclase (crtY)

The crtY gene product of Flavobacterium sp. R1534 is sufficient to introduce the β-ionone rings at both sides of
lycopene to obtain β-carotene. The lycopene cyclase of Flavobacterium sp. R1534 consists of 382 aa (Fig. 11).

β-carotene hydroxylase (crtZ)

The gene product of crtZ consists of 169 aa (Fig. 12) and hydroxylates β-carotene to the xanthophyll zeaxanthin.

Putative enzymatic functions of the ORFs (orf-1, orf-5 and orf-16)

The orf-1 has at the aa level over 40% identity to acetoacetyl-CoA thiolases of different organisms (e.g. Candida
tropicalis, human, rat). This gene therefore most likely encodes an acetoacetyl-CoA thiolase (acetyl-CoA
acetyltransferase), which condenses two molecules of acetyl-CoA to acetoacetyl-CoA. Condensation of acetoacetyl-CoA with a
third acetyl-CoA by the HMG-CoA synthase forms β-hydroxy-β-methylglutaryl-CoA (HMG-CoA). This compound is part
of the mevalonate pathway which produces besides sterols also numerous kinds of isoprenoids with diverse cellular
functions. In bacteria and plants, the isoprenoid pathway is also able to synthesize some unique products like
carotenoids, growth regulators (e.g. in plants gibberellins and abscisic acid) and secondary metabolites like phytoalexins [Riou
et al., Gene 148, 293-297 (1994)].

The orf-5 has a low homology of approx. 30% to the amino acid sequence of polyketide synthases from different
Streptomyces species (e.g. S. violaceoruber, S. cinnamonensis). These antibiotic synthesizing enzymes (polyketide synthases)
have been classified into two groups. Type-I polyketide synthases are large multifunctional proteins, whereas type-II
polyketide synthases are multiprotein complexes composed of several individual proteins involved in the subreactions
of the polyketide synthesis [Bibb et al., Gene 142, 31-39 (1994)].

The putative protein encoded by the orf-16 has at the aa level an identity of 42% when compared to the soluble
hydrogenase subunit of Anabaena cylindrica.

Functional assignment of the ORFs (crtE, crtB, crtI, crtY and crtZ) to enzymatic activities of the carotenoid
biosynthesis pathway.

The biochemical assignment of the gene products of the different ORFs was revealed by analyzing carotenoid
accumulation in E. coli host strains that were transformed with deleted variants of the Flavobacterium sp. gene cluster
and thus did not express all of the crt genes (Fig. 13).

Three different plasmids were constructed: pLyco, p59-2 and pZea4. Plasmid p59-2 was obtained by subcloning the
HindIII/BamHI fragment of clone 2 into the HindIII/BamHI sites of clone 59. p59-2 carries the ORFs of the crtE, crtB,
crtI and crtY genes and should lead to the production of β-carotene. pLyco was obtained by deleting the KpnI/KpnI
fragment, coding for approx. one half (N-terminus) of the crtY gene, from the p59-2 plasmid. E. coli cells transformed with
pLyco, and therefore having a truncated non-functional crtY gene, should produce lycopene, the precursor of
β-carotene. pZea4 was constructed by ligation of the AscI-SpeI fragment of p59-2, containing the crtE, crtB, crtI and most of
the crtY gene, with the AscI/XbaI fragment of clone 6a, containing the sequences to complete the crtY gene and the crtZ
gene. pZea4 [for complete sequence see Fig. 24; nucleotides 1 to 683 result from pBluescriptIIKS(+), nucleotides 684
to 8961 from Flavobacterium R1534 WT genome, nucleotides 8962 to 11233 from pBluescriptIIKS(+)] has therefore all
five ORFs of the zeaxanthin biosynthesis pathway. Plasmid pZea4 has been deposited on May 25, 1995 at the
DSM-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (Germany) under accession No. DSM 10012.
E. coli cells transformed with this latter plasmid should therefore produce zeaxanthin. For the detection of the carotenoid
produced, transformants were grown for 48 hours in shake flasks and then subjected to carotenoid analysis as
described in the methods section. Figure 13 summarizes the different inserts of the plasmids described above, and the
main carotenoid detected in the cells.
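
The composition of pZea4 given above (vector, genomic insert, vector) can be verified with a short calculation. The following Python sketch uses only the nucleotide ranges stated in the text, read as 1-based and inclusive; the data structure and function names are assumptions made for illustration.

```python
# (origin of the segment, first nucleotide, last nucleotide) as stated for pZea4.
PZEA4_SEGMENTS = [
    ("pBluescriptIIKS(+)", 1, 683),
    ("Flavobacterium R1534 WT genome", 684, 8961),
    ("pBluescriptIIKS(+)", 8962, 11233),
]


def segment_lengths(segments):
    """Return the length in bp of every segment (inclusive coordinates)."""
    return [last - first + 1 for _, first, last in segments]


def is_contiguous(segments):
    """True if each segment starts directly after the previous one ends."""
    return all(
        nxt[1] == cur[2] + 1 for cur, nxt in zip(segments, segments[1:])
    )


if __name__ == "__main__":
    for (origin, _, _), length in zip(PZEA4_SEGMENTS, segment_lengths(PZEA4_SEGMENTS)):
        print(f"{origin}: {length} bp")
    print("total:", sum(segment_lengths(PZEA4_SEGMENTS)), "bp")
```

The three segments are contiguous and add up to 11233 bp, of which 8278 bp originate from the Flavobacterium genome.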

As expected the pLyco carrying E. coli cells produced lycopene, those carrying p59-2 produced β-carotene
(all-E,9-Z,13-Z) and the cells having the pZea4 construct produced zeaxanthin. This confirms that all the necessary genes
of Flavobacterium sp. R1534 for the synthesis of zeaxanthin or its precursors (phytoene, lycopene and β-carotene)
were cloned.
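
The logic of the deletion analysis can be summarized as a walk along a linear pathway that stops at the first missing enzyme. The Python sketch below is a simplified model and not part of the described experiments: it assumes that the E. coli host supplies farnesyl pyrophosphate, that each step depends on exactly one crt gene, and that the last reachable product is the one that accumulates. All names are hypothetical.

```python
# (gene, substrate, product) in pathway order, following the enzyme
# functions described in the text.
PATHWAY = [
    ("crtE", "farnesyl pyrophosphate", "geranylgeranyl pyrophosphate"),
    ("crtB", "geranylgeranyl pyrophosphate", "phytoene"),
    ("crtI", "phytoene", "lycopene"),
    ("crtY", "lycopene", "beta-carotene"),
    ("crtZ", "beta-carotene", "zeaxanthin"),
]

# Functional crt genes carried by each plasmid, as described in the text.
PLASMIDS = {
    "pLyco": {"crtE", "crtB", "crtI"},
    "p59-2": {"crtE", "crtB", "crtI", "crtY"},
    "pZea4": {"crtE", "crtB", "crtI", "crtY", "crtZ"},
}


def expected_end_product(functional_genes, start="farnesyl pyrophosphate"):
    """Follow the pathway until a required gene is missing."""
    compound = start
    for gene, substrate, product in PATHWAY:
        if gene not in functional_genes or substrate != compound:
            break
        compound = product
    return compound


if __name__ == "__main__":
    for plasmid, genes in PLASMIDS.items():
        print(plasmid, "->", expected_end_product(genes))
```

With these assumptions the model predicts lycopene for pLyco, β-carotene for p59-2 and zeaxanthin for pZea4, matching the carotenoids detected.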



Materials and methods used for expression of carotenoid synthesizing enzymes 

Bacterial strains and plasmids: The vectors pBluescriptIIKS (+) or (-) (Stratagene, La Jolla, USA) and pUC18
[Vieira and Messing, Gene 19, 259-268 (1982); Norrander et al., Gene 26, 101-106 (1983)] were used for cloning in
different E. coli strains, like XL-1 blue (Stratagene), TG1 or JM109. In all B. subtilis transformations, strain 1012 was used.
Plasmids pHP13 [Haima et al., Mol. Gen. Genet. 209, 335-342 (1987)] and p602/22 [LeGrice, S.F.J. in Gene
Expression Technology, Goeddel, D.V., Editor, 201-214 (1990)] are Gram (+)/(-) shuttle vectors able to replicate in B. subtilis
and E. coli cells. Plasmid p205 contains the vegI promoter cloned into the SmaI site of pUC18. Plasmid pXI12 is an
integration vector for the constitutive expression of genes in B. subtilis [Haiker et al., in 7th Int. Symposium on the Genetics
of Industrial Microorganisms, June 26-July 1, 1994, Montreal, Quebec, Canada (1994)]. Plasmid pBEST501 [Itaya et
al., Nucleic Acids Res. 17 (11), 4410 (1989)] contains the neomycin resistance gene cassette originating from the
plasmid pUB110 (GenBank entry: M19465) of S. aureus [McKenzie et al., Plasmid 15, 93-103 (1986); McKenzie et al.,
Plasmid 17, 83-84 (1987)]. This neomycin gene has been shown to work as a selection marker when present in a single
copy in the genome of B. subtilis. Plasmid pC194 (ATCC 37034) (GenBank entry: L08860) originates from S. aureus
[Horinouchi and Weisblum, J. Bacteriol. 150, 815-825 (1982)] and contains the chloramphenicol acetyltransferase
gene.

Media and growth conditions: E. coli were grown in Luria broth (LB) at 37°C with 100 µg Ampicillin (Amp)/ml for
selection. B. subtilis cells were grown in VY-medium supplemented with either erythromycin (1 µg/ml), neomycin
(5-180 µg/ml) or chloramphenicol (10-80 µg/ml).

Transformation: E. coli transformations were done by electroporation using the Gene-pulser device of BIO-RAD
(Hercules, CA, USA) with the following parameters (200 Ω, 250 µFD, 2.5V). B. subtilis transformations were done
basically according to the standard procedure method 2.8 described by [Cutting and Vander Horn in Molecular Biological
Methods for Bacillus, Harwood, C.R. and Cutting, S.M., Editor, John Wiley & Sons: Chichester, England, 61-74 (1990)].

Colony screening: Bacterial colony screening was done as described by [Zon et al., s.a.]. 

Oligonucleotide synthesis: The oligonucleotides used for PCR reactions or for sequencing were synthesized with
an Applied Biosystems 392 DNA synthesizer.

PCR reactions: The PCR reactions were performed using either the UlTma DNA polymerase (Perkin Elmer
Cetus) or the Pfu Vent polymerase (New England Biolabs) according to the manufacturers' instructions. A typical 50
µl PCR reaction contained: 100 ng of template DNA, 10 pM of each of the primers, all four dNTPs (final conc. 300 µM),
MgCl2 (when UlTma polymerase was used; final conc. 2 mM), 1x UlTma reaction buffer or 1x Pfu buffer (supplied by
the manufacturer). All components of the reaction with the exception of the DNA polymerase were incubated at 95°C
for 2 min., followed by the cycles indicated in the respective section (see below). In all reactions a hot start was made
by adding the polymerase in the first round of the cycle during the 72°C elongation step. At the end of the PCR reaction
an aliquot was analysed on a 1% agarose gel, before extracting once with phenol/chloroform. The amplified fragment in
the aqueous phase was precipitated with 1/10 volume of a 3 M NaAcetate solution and two volumes of ethanol. After
centrifugation for 5 min. at 12000 rpm, the pellet was resuspended in an adequate volume of H2O, typically 40 µl, before
digestion with the indicated restriction enzymes was performed. After the digestion the mixture was separated on a 1% low
melting point agarose gel. The PCR products of the expected size were excised from the agarose and purified using the
glass beads method (GENECLEAN KIT, Bio 101, Vista CA, USA) when the fragments were above 400 bp or directly
spun out of the gel when the fragments were shorter than 400 bp as described by [Heery et al., TIBS 6 (6), 173 (1990)].

Oligos used for gene amplification and site directed mutagenesis: 

All PCR reactions performed to allow the construction of the different plasmids are described below. All the primers
used are summarized in figure 14.

Primers #100 and #101 were used in a PCR reaction to amplify the complete crtE gene having a SpeI restriction
site and an artificial ribosomal binding site (RBS) upstream of the transcription start site of this gene. At the 3' end of
the amplified fragment, two unique restriction sites were introduced, an AvrII and a SmaI site, to facilitate the further
cloning steps. The PCR reaction was done with UlTma polymerase using the following conditions for the amplification:
5 cycles with the profile: 95°C, 1 min./ 60°C, 45 sec./ 72°C, 1 min. and 20 cycles with the profile: 95°C, 1 min./ 72°C, 1
min. Plasmid pBIIKS(+)-clone2 served as template DNA. The final PCR product was digested with SpeI and SmaI and
isolated using the GENECLEAN KIT. The size of the fragment was approx. 910 bp.
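
For planning purposes, a cycling profile such as the one given above for the crtE amplification can be written down as data and its programmed hold time summed. The Python sketch below is an illustration only; the class names are hypothetical, and the result excludes the initial 2 min. denaturation at 95°C and any temperature ramping of the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    cycles: int    # number of repetitions of the step sequence
    steps: tuple   # Step objects executed once per cycle


# crtE amplification (primers #100/#101) as described in the text:
# 5 cycles of 95°C 1 min / 60°C 45 sec / 72°C 1 min,
# then 20 cycles of 95°C 1 min / 72°C 1 min.
CRTE_PROGRAM = (
    Stage(5, (Step(95, 60), Step(60, 45), Step(72, 60))),
    Stage(20, (Step(95, 60), Step(72, 60))),
)


def hold_time_seconds(program) -> int:
    """Sum of all programmed hold times, ignoring ramp times."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


if __name__ == "__main__":
    total = hold_time_seconds(CRTE_PROGRAM)
    print(f"{total} s ({total / 60:.2f} min)")
```

The same structure can hold the other profiles described in this section by changing the cycle numbers, temperatures and times.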

Primers #104 and #105 were used in a PCR reaction to amplify the crtZ gene from the translation start to the SalI
restriction site, located in the coding sequence of this gene. At the 5' end of the crtZ gene an EcoRI site, a synthetic RBS
and an NdeI site were introduced. The PCR conditions were as described above. Plasmid pBIIKS(+)-clone 6a served as
template DNA and the final PCR product was digested with EcoRI and SalI. Isolation of the fragment of approx. 480 bp
was done with the GENECLEAN KIT.

Primers MUT1 and MUT5 were used to amplify the complete crtY gene. At the 5' end, the last 23 nucleotides of the
crtZ gene including the SalI site are present, followed by an artificial RBS preceding the translation start site of the crtY
gene. The artificial RBS created includes a PmlI restriction site. The 3' end of the amplified fragment contains 22
nucleotides of the crtI gene, preceded by a newly created artificial RBS which contains a MunI restriction site. The conditions
used for the PCR reaction were as described above using the following cycling profile: 5 rounds of 95°C, 45 sec./ 60°C,
45 sec./ 72°C, 75 sec. followed by 22 cycles with the profile: 95°C, 45 sec./ 66°C, 45 sec./ 72°C, 75 sec. Plasmid
pXI12-ZYIB-EINV4 served as template for the Pfu Vent polymerase. The PCR product of 1225 bp was made blunt and
cloned into the SmaI site of pUC18, using the Sure-Clone Kit (Pharmacia) according to the manufacturer.

Primers MUT2 and MUT6 were used to amplify the complete crtI gene. At the 5' end the last 23 nucleotides of the crtY
gene are present, followed by an artificial RBS which precedes the translation start site of the crtI gene. The new RBS
created includes a MunI restriction site. The 3' end of the amplified fragment contains the artificial RBS upstream of the
crtB gene including a BamHI restriction site. The conditions used for the PCR reaction were basically as described
above including the following cycling profile: 5 rounds of 95°C, 30 sec./ 60°C, 30 sec./ 72°C, 75 sec., followed by 25
cycles with the profile: 95°C, 30 sec./ 66°C, 30 sec./ 72°C, 75 sec. Plasmid pXI12-ZYIB-EINV4 served as template for
the Pfu Vent polymerase. For the further cloning steps the PCR product of 1541 bp was digested with MunI and BamHI.

Primers MUT3 and CAR17 were used to amplify the N-terminus of the crtB gene. At the 5' end the last 28 nucleotides
of the crtI gene are present, followed by an artificial RBS preceding the translation start site of the crtB gene. This newly
created RBS includes a BamHI restriction site. The amplified fragment, named PCR-F, contains also the HindIII
restriction site located at the N-terminus of the crtB gene. The conditions used for the PCR reaction were as described
elsewhere in the text, including the following cycling profile: 5 rounds of 95°C, 30 sec./ 58°C, 30 sec./ 72°C, 20 sec. followed
by 25 cycles with the profile: 95°C, 30 sec./ 60°C, 30 sec./ 72°C, 20 sec. Plasmid pXI12-ZYIB-EINV4 served as
template for the Pfu Vent polymerase. The PCR product of approx. 160 bp was digested with BamHI and HindIII.

Oligos used to amplify the chloramphenicol resistance gene (cat):

Primers CAT3 and CAT4 were used to amplify the chloramphenicol resistance gene of pC194 (ATCC 37034)
[Horinouchi and Weisblum, s.a.], an R-plasmid found in S. aureus. The conditions used for the PCR reaction were as
described previously including the following cycling profile: 5 rounds of 95°C, 60 sec./ 50°C, 60 sec./ 72°C, 2 min.
followed by 20 cycles with the profile: 95°C, 60 sec./ 60°C, 60 sec./ 72°C, 2 min. Plasmid pC194 served as template for
the Pfu Vent polymerase. The PCR product of approx. 1050 bp was digested with EcoRI and AatII.

Oligos used to generate linkers: Linkers were obtained by adding 90 ng of each of the two corresponding primers
into an Eppendorf tube. The mixture was dried in a speed vac and the pellet resuspended in 1x Ligation buffer (Boehringer,
Mannheim, Germany). The solution was incubated at 50°C for 3 min. before cooling down to RT to allow the
primers to hybridize properly. The linkers were then ready to be ligated into the appropriate sites. All the oligos used to
generate linkers are shown in figure 15.

Primers CS1 and CS2 were used to form a linker containing the following restriction sites: HindIII, AflII, ScaI, XbaI,
PmeI and EcoRI.

Primers MUT7 and MUT8 were used to form a linker containing the restriction sites SalI, AvrII, PmlI, MluI, MunI,
BamHI, SphI and HindIII.

Primers MUT9 and MUT10 were used to introduce an artificial RBS upstream of crtY.

Primers MUT11 and MUT12 were used to introduce an artificial RBS upstream of crtE.

Isolation of RNA: Total RNA was prepared from log phase growing B. subtilis according to the method described
by [Maes and Messens, Nucleic Acids Res. 20 (16), 4374 (1992)].

Northern Blot analysis: For hybridization experiments up to 30 µg of B. subtilis RNA was electrophoresed on a
1% agarose gel made up in 1x MOPS and 0.66 M formaldehyde. Transfer to Zeta-Probe blotting membranes (BIO-RAD),
UV cross-linking, pre-hybridization and hybridization were done as described in [Farrell, R.E. Jr., RNA
Methodologies. A laboratory Guide for isolation and characterization. San Diego, USA: Academic Press]. The
washing conditions used were: 2 x 20 min. in 2xSSPE/0.1% SDS followed by 1 x 20 min. in 0.1xSSPE/0.1% SDS at
65°C. Northern blots were then analyzed either by a Phosphorimager (Molecular Dynamics) or by autoradiography on
X-ray films.

Isolation of genomic DNA: B. subtilis genomic DNA was isolated from 25 ml overnight cultures according to the
standard procedure method 2.6 described by [13].

Southern blot analysis: For hybridization experiments B. subtilis genomic DNA (3 µg) was digested with the
appropriate restriction enzymes and electrophoresed on a 0.75% agarose gel. The transfer to Zeta-Probe blotting
membranes (BIO-RAD) was done as described [Southern, E.M., s.a.]. Prehybridization and hybridization were in
7% SDS, 1% BSA (fraction V; Boehringer), 0.5 M Na2HPO4, pH 7.2 at 65°C. After hybridization the membranes were
washed twice for 5 min. in 2x SSC, 1% SDS at room temperature and twice for 15 min. in 0.1x SSC, 0.1% SDS at
65°C. Southern blots were then analyzed either by a Phosphorimager (Molecular Dynamics) or by autoradiography on
X-ray films from Kodak.

DNA sequence analysis: The sequence was determined by the dideoxy chain termination technique [Sanger et
al., s.a.] using the Sequenase Kit Version 1.0 (United States Biochemical). Sequence analyses were done using the
GCG sequence analysis software package (Version 8.0) by Genetics Computer, Inc. [Devereux et al., s.a.].

Gene amplification in B. subtilis: To amplify the copy number of the SFCO in B. subtilis transformants, a single
colony was inoculated in 15 ml VY-medium supplemented with 1.5% glucose and 0.02 mg chloramphenicol or
neomycin/ml, depending on the antibiotic resistance gene present in the amplifiable structure (see results and discussion).
The next day 750 µl of this culture were used to inoculate 13 ml VY-medium containing 1.5% glucose supplemented
with chloramphenicol (60, 80, 120 and 150 µg/ml) for the cat resistant mutants, or neomycin (160 µg/ml and 180 µg/ml)
for the neomycin resistant mutants. The cultures were grown overnight and the next day 50 µl of different dilutions
(1:20, 1:400, 1:8000 and 1:160'000) were plated on VY agar plates with the appropriate antibiotic concentration. Large
single colonies were then further analyzed to determine the number of copies and the amount of carotenoids produced.
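
The plating dilutions can be read as a 20-fold serial series. The sketch below reproduces that series and the corresponding volume of undiluted culture per plate. It assumes that the four steps are 1:20, 1:400, 1:8000 and 1:160'000 and that the plated volume is 50 µl; both are readings of a poorly legible passage rather than confirmed values. The function name is illustrative.

```python
def serial_dilutions(factor, steps):
    """Cumulative dilution factors for a serial dilution with a constant step factor."""
    return [factor ** k for k in range(1, steps + 1)]


DILUTIONS = serial_dilutions(factor=20, steps=4)
PLATED_UL = 50  # assumed plated volume in microlitres

for dilution in DILUTIONS:
    undiluted_nl = PLATED_UL / dilution * 1000  # nanolitres of original culture
    print(f"1:{dilution:<7} -> {undiluted_nl:10.3f} nl of undiluted culture per plate")
```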

Analysis of carotenoids: E. coli or B. subtilis transformants (200 - 400 ml) were grown for the times indicated in
the text, usually 24 to 72 hours, in LB-medium or VY-medium, respectively, supplemented with antibiotics, in shake
flasks at 37°C and 220 rpm.

The carotenoids produced by the microorganisms were extracted with an adequate volume of acetone using a rotation
homogenizer (Polytron, Kinematica AG, CH-Luzern). The homogenate was then filtered through the sintered glass
of a suction filter into a round bottom flask. The filtrate was evaporated by means of a rotation evaporator at 50°C using
a water-jet vacuum. For the zeaxanthin detection the residue was dissolved in n-hexane/acetone (86:14) before analysis
with a normal-phase HPLC as described in [Weber, S., s.a.]. For the detection of β-carotene and lycopene the evaporated
extract was dissolved in n-hexane/acetone (99:1) and analysed by HPLC as described in [Hengartner et al., s.a.].

Carotenoid production in E. coli

The biochemical assignment of the gene products of the different open reading frames (ORFs) of the carotenoid
biosynthesis cluster of Flavobacterium sp. was revealed by analyzing the carotenoid accumulation in E. coli host
strains transformed with plasmids carrying deletions of the Flavobacterium sp. gene cluster, and thus lacking some of
the crt gene products. Similar functional assays in E. coli have been described by other authors [Misawa et al., s.a.;
Perry et al., J. Bacteriol., 168, 607-612 (1986); Hundle et al., Molecular and General Genetics 254 (4), 406-416 (1994)].
Three different plasmids, pLyco, pBIIKS(+)-clone59-2 and pZea4, were constructed from the three genomic isolates
pBIIKS(+)-clone2, pBIIKS(+)-clone59 and pBIIKS(+)-clone6a (see figure 16).

Plasmid pBIIKS(+)-clone59-2 was obtained by subcloning the HindIII/BamHI fragment of pBIIKS(+)-clone2 into the
HindIII/BamHI sites of pBIIKS(+)-clone59. The resulting plasmid pBIIKS(+)-clone59-2 carries the complete ORFs of
the crtE, crtB, crtI and crtY genes and should lead to the production of β-carotene. pLyco was obtained by deleting the
KpnI/KpnI fragment, coding for approx. one half (N-terminus) of the crtY gene, from the plasmid pBIIKS(+)-clone59-2.
E. coli cells transformed with pLyco, and therefore having a truncated non-functional crtY gene, should produce lycopene,
the precursor of β-carotene. pZea4 was constructed by ligation of the AscI-SpeI fragment of pBIIKS(+)-clone59-2,
containing the crtE, crtB, crtI and most of the crtY gene, with the AscI/XbaI fragment of clone 6a, containing the crtZ
gene and sequences to complete the truncated crtY gene mentioned above. pZea4 therefore has all five ORFs of the
zeaxanthin biosynthesis pathway. E. coli cells transformed with this latter plasmid should therefore produce zeaxanthin.
For the detection of the carotenoid produced, transformants were grown for 43 hours in shake flasks and then subjected
to carotenoid analysis as described in the methods section. Figure 16 summarizes the construction of the plasmids
described above.
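
The reasoning behind the three test plasmids can be sketched as a linear pathway in which each gene product converts the preceding intermediate, so the expected end product is the last intermediate reached before the first missing gene. The model below is a simplification for illustration only: the step order follows the commonly accepted roles of crtE, crtB, crtI, crtY and crtZ, and the gene sets per plasmid are taken from the description above. All identifiers are hypothetical.

```python
# Ordered pathway steps: (gene required, product formed once that gene is present).
PATHWAY = [
    ("crtE", "geranylgeranyl pyrophosphate"),
    ("crtB", "phytoene"),
    ("crtI", "lycopene"),
    ("crtY", "beta-carotene"),
    ("crtZ", "zeaxanthin"),
]

# Functional crt genes carried by each plasmid, as described in the text.
PLASMIDS = {
    "pLyco": {"crtE", "crtB", "crtI"},  # crtY truncated, hence non-functional
    "pBIIKS(+)-clone59-2": {"crtE", "crtB", "crtI", "crtY"},
    "pZea4": {"crtE", "crtB", "crtI", "crtY", "crtZ"},
}


def expected_end_product(functional_genes):
    """Walk the pathway and stop at the first step whose gene is missing."""
    product = None
    for gene, formed in PATHWAY:
        if gene not in functional_genes:
            break
        product = formed
    return product


for name, genes in PLASMIDS.items():
    print(f"{name}: {expected_end_product(genes)}")
```

The model ignores incomplete conversion, which is why it does not predict the small residual β-carotene fraction reported for pZea4 in Table 1.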

As expected, the pLyco carrying E. coli cells produced lycopene, those carrying pBIIKS(+)-clone59-2 produced
β-carotene (all-E, 9-Z, 13-Z) and the cells having the pZea4 construct produced zeaxanthin. This confirms that we have
cloned all the necessary genes of Flavobacterium sp. R1534 for the synthesis of zeaxanthin or its precursors (phytoene,
lycopene and β-carotene). The production levels obtained are shown in table 1.



plasmid               host            zeaxanthin   β-carotene   lycopene
pLyco                 E. coli JM109   ND           ND           0.05%
pBIIKS(+)-clone59-2                   ND           0.03%        ND
pZea4                                 0.033%       0.0009%      ND



Table 1: Carotenoid content of E. coli transformants, carrying the plasmids 
pLyco, pBIIKS(+)-clone59-2 and pZea4, after 43 hours of culture 
in shake flasks. The values indicated show the carotenoid content 
in % of the total dry cell mass (200 ml). ND = not detectable. 



Example 5

Carotenoid production in B. subtilis 

In a first approach to produce carotenoids in B. subtilis, we cloned the carotenoid biosynthesis genes of
Flavobacterium into the Gram (+)/(-) shuttle vector p602/22, a derivative of p602/20 [LeGrice, S.F.J., s.a.]. The assembly
of the final construct p602-CARVEG-E begins with a triple ligation of the PvuII-AvrII fragment of pZea4(del654-3028)
and the AvrII-EcoRI fragment from plasmid pBIIKS(+)-clone6a into the EcoRI and ScaI sites of the vector
p602/22. The plasmid pZea4(del654-3028) had been obtained by digesting pZea4 with SacI and EspI. The protruding
and recessed ends were made blunt with Klenow enzyme and religated. Construct pZea4(del654-3028) lacks most of
the sequence upstream of the crtE gene, which is not needed for carotenoid biosynthesis. The plasmid p602-CAR has
approx. 6.7 kb of genomic Flavobacterium R1534 DNA containing, besides all five carotenoid genes (approx. 4.9 kb),
additional genomic DNA of 1.2 kb located upstream of the crtZ translation start site and a further 200 bp located
upstream of the crtE transcription start. The crtZ, crtY, crtI and crtB genes were cloned downstream of the PN25/0 promoter,
a regulatable E. coli bacteriophage T5 promoter derivative fused to a lac operator element, which is functional in B.
subtilis [LeGrice, S.F.J., s.a.]. It is obvious that in the p602CAR construct the distance of over 1200 bp between the
PN25/0 promoter and the transcription start site of crtZ is not optimal; it will be improved at a later stage. An outline of
the p602CAR construction is shown in figure 17. To ensure transcription of the crtE gene in B. subtilis, the vegI promoter
[Moran et al., Mol. Gen. Genet. 186, 339-346 (1982); LeGrice et al., Mol. Gen. Genet. 204, 229-236 (1986)] was
introduced upstream of this gene, resulting in the plasmid construct p602-CARVEG-E. The vegI promoter, which originates
from site I of the veg promoter complex described by [LeGrice et al., s.a.], has been shown to be functional in E.
coli [Moran et al., s.a.]. To obtain this new construct, the plasmid p602CAR was digested with SalI and HindIII, and the
fragment containing the complete crtE gene and most of the crtB coding sequence was subcloned into the XhoI and
HindIII sites of plasmid p205. The resulting plasmid p205CAR contains the crtE gene just downstream of the PvegI promoter.
To reconstitute the carotenoid gene cluster of Flavobacterium sp. the following three pieces were isolated: the
PmeI/HindIII fragment of p205CAR, the HincII/XbaI fragment and the EcoRI/HindIII fragment of p602CAR. These were ligated
into the EcoRI and XbaI sites of pBluescriptIIKS(+), resulting in the construct pBIIKS(+)-CARVEG-E. Isolation of the
EcoRI-XbaI fragment of this latter plasmid and ligation into the EcoRI and XbaI sites of p602/22 gives a plasmid similar
to p602CAR but having the crtE gene driven by the PvegI promoter. All the construction steps to get the plasmid
p602CARVEG-E are outlined in figure 18. E. coli TG1 cells transformed with this plasmid synthesized zeaxanthin. In
contrast, B. subtilis strain 1012 transformed with the same constructs did not produce any carotenoids. Analysis of several
zeaxanthin negative B. subtilis transformants always revealed that the transformed plasmids had undergone
severe deletions. This instability could be due to the large size of the constructs.

In order to obtain a stable construct in B. subtilis, the carotenoid genes were cloned into the Gram (+)/(-) shuttle
vector pHP13 constructed by [Haima et al., s.a.]. The stability problems were thought to be avoidable by 1) reducing the
size of the cloned insert which carries the carotenoid genes and 2) reversing the orientation of the crtE gene and thus
requiring only one promoter for the expression of all five genes, instead of two as in the previous constructs. Furthermore,
the ability of cells transformed by such a plasmid carrying the synthetic Flavobacterium carotenoid operon
(SFCO) to produce carotenoids would answer the question whether a modular approach is feasible. Figure 19 summarizes
all the construction steps and intermediate plasmids made to get the final construct pHP13-2PNZYIB-EINV. Briefly: To
facilitate the following constructions, a vector pHP13-2 was made by introducing a synthetic linker, obtained with primers
CS1 and CS2, between the HindIII and EcoRI sites of the shuttle vector pHP13. The intermediate construct
pHP13-2CARVEG-E was constructed by subcloning the AflII-XbaI fragment of p602CARVEG-E into the AflII and XbaI sites of
pHP13-2. The next step consisted of the inversion of the crtE gene, by removing the XbaI-AvrII fragment containing the
original crtE gene and replacing it with the XbaI-AvrII fragment of plasmid pBIIKS(+)-PCRRBScrtE. The resulting plasmid
was named pHP13-2CARZYIB-EINV and represented the first construction with a functional SFCO. The intermediate
construct pBIIKS(+)-PCRRBScrtE mentioned above was obtained by digesting the PCR product generated with
primers #100 and #101 with SpeI and SmaI and ligating into the SpeI and SmaI sites of pBluescriptIIKS(+). In order to
get the crtZ transcription start close to the promoter PN25/0, a triple ligation was done with the BamHI-SalI fragment of
pHP13-2CARZYIB-EINV (contains four of the five carotenoid genes), the BamHI-EcoRI fragment of the same plasmid
containing the PN25/0 promoter and the EcoRI-SalI fragment of pBIIKS(+)-PCRRBScrtZ, having most of the crtZ gene
preceded by a synthetic RBS. The aforementioned plasmid pBIISK(+)-PCRRBScrtZ was obtained by digesting the
PCR product amplified with primers #104 and #105 with EcoRI and SalI and ligating into the EcoRI and SalI sites of
pBluescriptIISK(+). In the resulting vector pHP13-2PN25ZYIB-EINV, the SFCO is driven by the bacteriophage T5 promoter
PN25/0, which should be constitutively expressed due to the absence of a functional lac repressor in the construct
[Peschke and Beuk, J. Mol. Biol. 186, 547-555 (1985)]. E. coli TG1 cells transformed with this construct produced
zeaxanthin. Nevertheless, when this plasmid was transformed into B. subtilis, no carotenoid production could be detected.
Analysis of the plasmids of these transformants showed severe deletions, pointing towards instability problems similar
to the observations made with the aforementioned plasmids.

Example 6

Chromosome Integration Constructs 

Due to the instability observed with the previous constructs we decided to integrate the carotenoid biosynthesis
[...] subtilis strain obtained was named BS1012::SFCO1. The last Flavobacterium RBS to be exchanged was the one
preceding the crtE gene. This was done using a linker obtained using primers MUT11 and MUT12. The wild type RBS was
removed from pXI12-ZYIB-EINV4MUTRBS with NdeI and SpeI and the above mentioned linker was inserted. In the
construct pXI12-ZYIB-EINV4MUTRBS2C all Flavobacterium RBS's have been replaced by synthetic RBS's of the consensus
sequence AAAGGAGG-7-8 N-ATG (see table 2). E. coli TG1 cells transformed with this construct showed that
this last RBS replacement also had not interfered



Table 2

mRNA    nucleotide sequence
crtZ    AAAGGAGG GUUUCAU AUG AGC
crtY    AAAGGAGG ACACGUG AUG AGC
crtI    AAAGGAGG CAAUUGAG AUG AGU
crtB    AAAGGAGG AUCCAAUC AUG ACC
crtE    AAAGGAGG GUUUCUU AUG ACG

B. subtilis 16S rRNA   3'-UCUUUCCUCCACUAG
E. coli 16S rRNA       3'-AUUCCUCCACUAG

Table 2: Nucleotide sequences of the synthetic ribosome binding
sites in the constructs pXI12-ZYIB-EINV4MUTRBS2C,
pXI12-ZYIB-EINV4MUTRBS2CCAT and
pXI12-ZYIB-EINV4MUTRBS2CNEO. Nucleotides of the
Shine-Dalgarno sequence preceding the individual carotenoid
genes which are complementary to the 3' end of the 16S
rRNA of B. subtilis are shown in bold. The 3' end of the 16S
rRNA of E. coli is also shown for comparison. The
underlined AUG is the translation start site of the
mentioned gene.
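
As a cross-check of Table 2, the following sketch tests each synthetic RBS against the consensus AAAGGAGG-7-8 N-AUG and looks for the restriction sites that the text places in the crtI (MunI) and crtB (BamHI) ribosome binding sites. The sequences are typed in as read from a partly illegible table, so they are an assumption; the recognition sequences CAATTG (MunI) and GGATCC (BamHI) are the standard ones for these enzymes. Function and variable names are illustrative.

```python
import re

# Synthetic RBS regions (mRNA, 5' to 3'), as read from Table 2.
RBS_MRNA = {
    "crtZ": "AAAGGAGGGUUUCAUAUGAGC",
    "crtY": "AAAGGAGGACACGUGAUGAGC",
    "crtI": "AAAGGAGGCAAUUGAGAUGAGU",
    "crtB": "AAAGGAGGAUCCAAUCAUGACC",
    "crtE": "AAAGGAGGGUUUCUUAUGACG",
}

# Shine-Dalgarno core, a 7-8 nt spacer, then the AUG start codon.
CONSENSUS = re.compile(r"AAAGGAGG([ACGU]{7,8})AUG")


def spacer_length(mrna):
    """Length of the spacer between AAAGGAGG and AUG, or None if no match."""
    match = CONSENSUS.match(mrna)
    return len(match.group(1)) if match else None


def as_dna(mrna):
    """Coding-strand DNA equivalent of an mRNA sequence."""
    return mrna.replace("U", "T")


for gene, mrna in RBS_MRNA.items():
    print(gene, spacer_length(mrna))

print("MunI site in crtI RBS:", "CAATTG" in as_dna(RBS_MRNA["crtI"]))
print("BamHI site in crtB RBS:", "GGATCC" in as_dna(RBS_MRNA["crtB"]))
```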

with the ability to produce zeaxanthin. All the regions containing the newly introduced synthetic RBS's were confirmed
by sequencing. B. subtilis cells were transformed with plasmid pXI12-ZYIB-EINV4MUTRBS2 and one transformant
having integrated the SFCO by reciprocal recombination into the levan-sucrase gene of the chromosome was
selected. This strain was named BS1012::SFCO2. Analysis of the carotenoid production of this strain showed that the
amount of zeaxanthin produced is approx. 40% of the zeaxanthin produced by E. coli cells transformed with the plasmid
used to get the B. subtilis transformant. A similar observation was made when comparing the BS1012::SFCO1 strain with
its E. coli counterpart (30%). Although the E. coli cells have 18 times more carotenoid genes, the carotenoid production
is only a factor of 2-3 times higher. More drastic was the difference observed in the carotenoid contents between E. coli
cells carrying the pZea4 construct in about 200 copies and the E. coli carrying the plasmid
pXI12-ZYIB-EINV4MUTRBS2C in 18 copies. The first transformant produced 48x more zeaxanthin than the latter one. This
difference cannot be attributed only to the roughly 11 times more carotenoid biosynthesis genes present in these
transformants. Contributing to this difference is probably also the suboptimal performance of the newly constructed SFCO,
in which the overlapping genes of the wild type Flavobacterium operon were separated to introduce the synthetic
RBS's. This could have resulted in a lower translation efficiency of the rebuilt synthetic operon (e.g. due to elimination
of putative translational coupling effects present in the wild type operon).
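
The copy-number argument above can be made explicit with a small calculation. The sketch divides the observed yield ratio by the gene-dosage ratio to obtain a per-copy productivity ratio. The input numbers (about 200 versus 18 plasmid copies, a 48-fold yield difference) are the approximate values quoted in the text; the function name is illustrative.

```python
def per_copy_ratio(yield_ratio, copy_ratio):
    """How much more product each gene copy yields in construct A than in construct B."""
    return yield_ratio / copy_ratio


# pZea4 (about 200 copies) versus pXI12-ZYIB-EINV4MUTRBS2C (18 copies) in E. coli.
copy_ratio = 200 / 18
yield_ratio = 48

print(f"gene dosage ratio:    {copy_ratio:.1f}x")
print(f"per-copy yield ratio: {per_copy_ratio(yield_ratio, copy_ratio):.2f}x")
```

Under these figures each copy of the wild type cluster on pZea4 would account for roughly four times more zeaxanthin than each copy of the rebuilt operon, which is the gap the text attributes to the lower efficiency of the synthetic operon.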

In order to increase the carotenoid production, two new constructs were made, pXI12-ZYIB-EINV4MUTRBS2CNEO
and pXI12-ZYIB-EINV4MUTRBS2CCAT, which after the integration of the SFCO into the
levan-sucrase site of the chromosome generate strains with an amplifiable structure as described by [Janniere et al.,
Gene 40, 47-55 (1985)]. Plasmid pXI12-ZYIB-EINV4MUTRBS2CNEO has been deposited on May 25, 1995 at the
DSM-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (Germany) under accession No. DSM 10013.
Such amplifiable structures, when linked to a resistance marker (e.g. chloramphenicol, neomycin, tetracycline), can be
amplified to 20-50 copies per chromosome. The amplifiable structure consists of the SFCO, the resistance gene and the
pXI12 sequence, flanked by direct repeats of the sac-B 3' gene (see figure 22). New strains having elevated numbers
of the SFCO could now be obtained by selecting for transformants with an increased level of resistance to the antibiotic.
To construct plasmid pXI12-ZYIB-EINV4MUTRBS2CNEO, the neomycin resistance gene was isolated from plasmid
pBEST501 with PstI and SmaI and subcloned into the PstI and EcoO109I sites of the pUC18 vector. The resulting construct
was named pUC18-Neo. To get the final construct, the PmeI-AatII fragment of plasmid
pXI12-ZYIB-EINV4MUTRBS2C was replaced with the SmaI-AatII fragment of pUC18-Neo, containing the neomycin resistance
gene. Plasmid pXI12-ZYIB-EINV4MUTRBS2CCAT was obtained as follows: the chloramphenicol resistance gene of
pC194 was isolated by PCR using the primer pair cat3 and cat4. The fragment was digested with EcoRI and AatII and
subcloned into the EcoRI and AatII sites of pUC18. The resulting plasmid was named pUC18-CAT. The final vector was
obtained by replacing the PmeI-AatII fragment of pXI12-ZYIB-EINV4MUTRBS2C with the EcoRI-AatII fragment of
pUC18-CAT, carrying the chloramphenicol resistance gene. Figure 23 summarizes the different steps to obtain the
aforementioned constructs. Both plasmids were transformed into B. subtilis strain 1012, and transformants resulting from a
Campbell-type integration were selected. Two strains, BS1012::SFCONEO1 and BS1012::SFCOCAT1, were chosen for
further amplification. Individual colonies of both strains were independently amplified by growing them in different
concentrations of antibiotics as described in the methods section. For the cat gene carrying strain, the chloramphenicol
concentrations were 60, 80, 120 and 150 µg/ml. For the neo gene carrying strain, the neomycin concentrations were
160 and 180 µg/ml. From both strains only daughter strains with minor amplifications of the SFCO were obtained. In daughter
strains generated from strain BS1012::SFCONEO1, the resistance to higher neomycin concentrations correlated with
the increase in the number of SFCO's in the chromosome and with higher levels of carotenoids produced by these cells.
A different result was obtained with daughter strains obtained from strain BS1012::SFCOCAT1. In these strains an
increase up to 150 µg chloramphenicol/ml resulted, as expected, in a higher number of SFCO copies in the chromosome.

Example 7 

Construction of crtW containing plasmids and use for carotenoid production

Polymerase chain reaction based gene synthesis. The nucleotide sequence of the artificial crtW gene, encoding
the β-carotene β-4-oxygenase of Alcaligenes strain PC-1, was obtained by back translating the amino acid sequence
outlined in [Misawa, 1995], using the BackTranslate program of the GCG Wisconsin Sequence Analysis Package,
Version 8.0 (Genetics Computer Group, Madison, WI, USA) and a codon frequency reference table of E. coli (supplied by
the BackTranslate program). The synthetic gene consisting of 726 nucleotides was constructed basically according to
the method described by [Ye, 1992]. The sequences of the 12 oligonucleotides (crtW1 - crtW12) required for the synthesis
are shown in Figure 25. Briefly, the long oligonucleotides were designed to have short overlaps of 15-20 bases, serving
as primers for the extension of the oligonucleotides. After four cycles a few copies of the full length gene should be
present, which are then amplified by the two terminal oligonucleotides crtW15 and crtW26. The sequences of these two
short oligonucleotides are, for the forward primer crtW15, 5'-TATATCTAGAcatatgTCCGGTCGTAAACCGG-3' and, for
the reverse primer crtW26, 5'-TATAgaattccacgtgTCAAGCACGACCACCGGTTTTACG-3', where the sequences
matching the DNA templates are underlined. Small cap letters show the introduced restriction sites (NdeI for the forward
primer and EcoRI and PmlI for the reverse primer) for the later cloning into the pALTER-Ex2 expression vector.
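
The restriction sites built into the two terminal primers can be located with a short scan. This is a minimal sketch: the primer sequences are taken from the paragraph above with the spacing removed, and the recognition sequences used (NdeI CATATG, EcoRI GAATTC, PmlI CACGTG) are the standard ones for these enzymes. Helper names are illustrative.

```python
SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC", "PmlI": "CACGTG"}

CRTW15 = "TATATCTAGAcatatgTCCGGTCGTAAACCGG"          # forward primer
CRTW26 = "TATAgaattccacgtgTCAAGCACGACCACCGGTTTTACG"  # reverse primer


def find_sites(primer):
    """Map enzyme name to the 0-based position of its first site in the primer."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case-insensitive input)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


print("crtW15:", find_sites(CRTW15))
print("crtW26:", find_sites(CRTW26))

# The gene-specific part of the reverse primer, read on the coding strand,
# should end with the stop codon of the synthetic gene.
coding_3prime = reverse_complement(CRTW26[16:])
print("coding strand 3' end:", coding_3prime[-9:])
```

In the forward primer the NdeI site overlaps the ATG start codon, so an NdeI digest leaves the start codon at the very beginning of the cloned fragment.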

Polymerase chain reaction. All twelve long oligonucleotides (crtW1-crtW12; 7 nM each) and both terminal primers
(crtW15 and crtW26; 0.1 µM each) were mixed and added to a PCR reaction mix containing Expand™ High Fidelity
polymerase (Boehringer, Mannheim) (3.5 units) and dNTP's (100 µM each). The PCR reaction was run for 30 cycles
[...] Kit (Bio 101, Vista, CA, USA). The fragment was subsequently cloned into the SmaI site of plasmid pUC18, using the
[...] Uppsala, Sweden). The sequence of the resulting crtW gene was determined by
sequencing with the Sequenase Kit Version 1.0 (United States Biochemical, Cleveland, OH, USA). The crtW gene constructed
by this method was found to contain minor errors, which were subsequently corrected by site-directed mutagenesis.



Construction of plasmids. Plasmid pBIIKS(+)-CARVEG-E (see also Example 5) [...] the carotenoid
biosynthesis genes (crtE, crtB, crtY, crtI and crtZ) of the Gram (-) bacterium Flavobacterium sp. strain R1534 WT
[...] [Pasamontes, 1995 #732] cloned into a modified pBluescript II KS(+) vector (Stratagene, La Jolla, USA)
carrying site I of the B. subtilis veg promoter [LeGrice, 1986 #806]. This constitutive promoter has been shown to be
functional in E. coli [...].

Plasmid pALTER-Ex2-crtW was constructed by cloning the NdeI - EcoRI restricted fragment of the synthetic crtW gene
into [...] of plasmid pALTER-Ex2 (Promega, Madison, WI). Plasmid pALTER-Ex2 is a low copy plasmid
with the p15a origin of replication, which allows it to be maintained with ColE1 vectors in the same host. Plasmid
pBIIKS-crtEBIYZW (Figure 26) was obtained by cloning the HindIII/PmlI fragment of [...]

[...] deleting the NdeI - HpaI fragment in plasmid pBIIKS-crtEBIYZW as outlined above. Plasmids
pALTER-Ex2-crtEBIY[ΔZW] and pALTER-Ex2-crtEBIYZ[ΔW] were obtained by isolating the BamHI-XbaI
[...] XhoI sites of pALTER-Ex2. The plasmid pBIIKS-crtW was constructed by digesting pBIIKS-crtEBIYZW with NsiI and
SacI [...] after recessing the DNA overhangs with Klenow enzyme. Figure 27 compiles the [...]

[...] in Luria-Broth medium supplemented with antibiotics (ampicillin 100 µg/ml, tetracyclin 12.5 µg/ml) in shake flasks at
37°C and 220 rpm. Carotenoids were extracted from the cells with acetone. The acetone was removed in vacuo and
the residue was redissolved in toluene. The coloured solutions were subjected to high-performance liquid chromatography
(HPLC) analysis, which was performed on a Hewlett-Packard series 1050 instrument. The carotenoids were separated
on a silica column Nucleosil Si-100, 200 x 4 mm, 3 µm. The solvent system included two solvents, hexane (A)
and [...] (B). A linear gradient was applied running from 13 to 50 % (B) within 15 minutes. The flow rate
was 1.5 ml/min. Peaks were detected [...] by a photo diode array detector. The individual carotenoid pigments
[...] pure carotenoids prepared by chemical synthesis and characterised by NMR, MS and UV spectra. HPLC analysis of
the pigments isolated from E. coli cells transformed with plasmid pBIIKS-crtEBIYZW, carrying besides the carotenoid
biosynthesis genes of Flavobacterium sp. strain R1534 also the crtW gene encoding the β-carotene [...] of
Alcaligenes PC-1 [Misawa, 1995 #670], gave the following major peaks identified as: β-cryptoxanthin, astaxanthin,
adonixanthin and zeaxanthin, based on the retention times and on the comparison of the absorbance spectra to given
reference samples of chemically pure carotenoids. The relative amount (area percent) of the pigment accumulated in the
E. coli transformant carrying pBIIKS-crtEBIYZW is shown in Table 3 ["CRX": cryptoxanthin; "ASX": astaxanthin; "ADX":
adonixanthin; "ZXN": zeaxanthin; "ECH": echinenone; "HECH": 3-hydroxyechinenone; "CXN": canthaxanthin]. The sum of
the peak areas of all identified carotenoids was defined as 100%. Numbers shown in Table 3 [...]
cultures for each transformant. In contrast [...]
echinenone, hydroxyechinenone and [...]
compared to the transformants carrying all the crt genes on the same plasmid (pBIIKS-crtEBIYZDW). Plasmid [...]
[...] carrying the functional genes crtE, crtB, crtY, crtI and crtZ of [...]
R1534 and a truncated, non-functional version of the crtW gene, whereas the functional copy [...]
and the low copy construct pALTER-Ex2-crtEBIYZ[ΔW], [...]
genes. Pigment analysis of these transformants by HPLC monitored the presence of [...]
adonixanthin, zeaxanthin, 3-hydroxyechinenone and minute traces of echinenone and canthaxanthin (Table 3).
Transformants harbouring the crtW gene on the low copy plasmid pALTER-Ex2-crtW and the genes crtE, crtB, crtY
and crtI on the high copy plasmid pBIIKS-crtEBIY[ΔZW] expressed only minor amounts of canthaxanthin (6 %) but high
levels of echinenone (94 %), whereas cells carrying the crtW gene on the high copy plasmid pBIIKS-crtW and the other
crt genes on the low copy construct pALTER-Ex2-crtEBIY[ΔZW] had 78.6 % and 21.4 % of echinenone and canthaxanthin,
respectively (Table 3).



Table 3

plasmids                                CRX   ASX   ADX    ZXN    ECH    HECH   CXN
pBIIKS-crtEBIYZW                        1.1   2.0   44.2   52.4   < 1    < 1    < 1
pBIIKS-crtEBIYZ[ΔW] + pALTER-Ex2-crtW   2.2         25.4   72.4   < 1    < 1    < 1
pBIIKS-crtEBIY[ΔZ]W                                               66.5          33.5
pBIIKS-crtEBIY[ΔZW] + pBIIKS-crtW                                 94            6
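
The values in Table 3 are area percentages, with the summed peak areas of all identified carotenoids set to 100 %. The sketch below shows that normalisation. The raw peak areas in the example are invented for illustration and are not taken from the source; they were chosen only so that the result matches the 66.5 / 33.5 split of one table row.

```python
def area_percent(peak_areas):
    """Normalise integrated peak areas so that all identified peaks sum to 100 %."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Hypothetical integrated areas (arbitrary detector units).
example_areas = {"ECH": 1330.0, "CXN": 670.0}

for pigment, percent in area_percent(example_areas).items():
    print(f"{pigment}: {percent:.1f} %")
```

Because only identified peaks enter the sum, the percentages describe the composition of the carotenoid fraction and say nothing about the absolute amount of pigment per cell mass.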



Example 8

Selective carotenoid production by using the crtW and crtZ genes of the Gram negative bacterium E-396.

In this section we describe E. coli transformants which accumulate only one (canthaxanthin) or two main carotenoids
(astaxanthin, adonixanthin) and minor amounts of adonirubin, rather than the complex variety of carotenoids seen
in most carotenoid producing bacteria [Yokoyama et al., Biosci. Biotechnol. Biochem. 58:1842-1844 (1994)] and some
of the E. coli transformants shown in Table 3. The ability to construct strains producing only one carotenoid is a major
step towards a successful biotechnological carotenoid production process. This increase in the accumulation of individual
carotenoids, accompanied by a decrease of the intermediates, was obtained by replacing the crtZ of Flavobacterium
R1534 and/or the synthetic crtW gene (see example 5) by their homologous genes originating from the astaxanthin
producing Gram negative bacterium E-396 (FERM BP-4283) [Tsubokura et al., EP-application 0 635 576 A1]. Both genes,
crtWE396 and crtZE396, were isolated and used to construct new plasmids as outlined below.

Isolation of a putative fragment of the crtW gene of strain E-396 by the polymerase chain reaction. Based on protein
sequence comparison of the crtW enzymes of Agrobacterium aurantiacum, Alcaligenes PC-1 (WO95/18220) [Misawa
et al., J. Bacteriol. 177: 6575-6584 (1995)] and Haematococcus pluvialis [Kajiwara et al., Plant Mol. Biol. 29:343-352
(1995)][Lotan et al., FEBS Letters 364:125-128 (1995)], two regions named I and II, having high amino acid conservation
and located approx. 140 amino acids apart, were identified and chosen to design the degenerate PCR primers
shown below. The N-terminal peptide HDAMHG (region I) was used to design the two 17-mer degenerate primer
sequences crtW100 and crtW101:

crtW100: 5'-CA(C/T)GA(C/T)GC(A/C)ATGCA(C/T)GG-3'
crtW101: 5'-CA(C/T)GA(C/T)GC(G/T)ATGCA(C/T)GG-3'

The C-terminal peptide H(W/H)EHH(R/L) corresponding to region II was used to design the two 17-mer degenerate
primers with the antisense sequences crtW105 and crtW106:

crtW105: 5'-AG(G/A)TG(G/A)TG(T/C)TC(G/A)TG(G/A)TG-3'
crtW106: 5'-AG(G/A)TG(G/A)TG(T/C)TCCCA(G/A)TG-3'

Polymerase chain reaction. PCR was performed using the GeneAmp Kit (Perkin Elmer Cetus) according to the
manufacturer's instructions. The different PCR reactions contained combinations of the degenerate primers
(crtW100/crtW105, crtW100/crtW106, crtW101/crtW105 or crtW101/crtW106) at a final concentration of 50 pM
each, together with genomic DNA of the bacterium E-396 (200 ng) and 2.5 units of Taq polymerase. In total 35 cycles
of PCR were performed with the following cycle profile: 95 °C for 30 sec, 55 °C for 30 sec, 72 °C for 30 sec. PCR reactions
made with the primer combinations crtW100/crtW105 and crtW101/crtW105 gave PCR amplification
products of approx. 500 bp, which were in accordance with the expected fragment size. The 500 bp fragment,
JAPclone8, obtained in the PCR reaction using primers crtW101 and crtW105 was excised from a 1.5% agarose gel
and purified using the GENECLEAN Kit and subsequently cloned into the SmaI site of pUC18 using the Sure-Clone Kit,
[...] identity at the nucleotide [...]

[...] combinations of restriction enzymes and separated by [...] fragments by Southern blotting onto a [...]
was hybridised with a 32P labelled 334 bp fragment obtained by digesting the aforementioned PCR fragment
[...] BssHII and MluI. An approx. 9.4 kb EcoRI/BamHI fragment hybridizing to the probe [...] appropriate for
cloning since it is long enough to potentially carry the complete crt cluster. The fragment [...] into the EcoRI
and BamHI sites of pBluescriptIIKS resulting in plasmid pJAPCL544 (Fig. 29). Based on the sequence of the PCR
fragment JAPclone8, primers were synthesized to obtain more sequence [...] hand of this fragment. Fig. 30 shows
the sequence obtained containing the crtWE396 (from nucleotide 40 to 768) and crtZE396 (from nucleotide 765 to 1253)
genes of the bacterium E-396. The nucleotide sequence [...] Fig. 33 and the corresponding
amino acid sequence in Fig. [...]

98 ^S^: Both 

o isolated by PGR using primer crtW107 ^J*"™^ E ^™ H J subS( £ uent cloning steps (see section 
nheim. according to the man ^^ contains an artificial A/del site (under- 

below) the primer crt107 F-HGBW&Xto^^ prime r crtW108 (S'-ATCICQAST- 

lined sequence) spanning the ATG start codon of "^^J^^ w!n«> j ust downstream of the TGA stop 
CACGTGCGCTCCTGCGCCTCGGCC-3 ) has an Xho\ ate 0*^"* ' ^ » ic DNA of the bacte- 

> 5 codon of the crtZ E396 gene. The final PCR reac .on m.x had 10 pM of each ^primer 9 9 ^ ^ fo||owing 

5 rW-396and3.5unitsofthe ? qDNA^ 

cycle profile: 95 °C. 1 min; 60 °C, 1 m.n; 72 °C 1m n 30 sec ^^he P^prc^uci £ Sure-Clone Kit. 

k agarose gel ^ pur.ed using 

The resulting construct was named P UC18 " E396 ™ with A/del and Xho\ and 

so lows. The crtW E396 and crtZ E396 gene were > ^™ in plasmid P BI.KS-crtEBIY[E396WZ] (Fig. 

thin ^i^ 

°C, 20 sec) 

primer crtW1 13 (5'-ATATACATATGGTGTCCCCCTTGGTGCGGGTGC-3') 
" primer crtW1 1 4 (5'-TATGGATCCGACGCGTTCCCGGACCGCCACAATGC-3') 

TheresuKingtSObpfragme^^ 
iSKM-PCRRBScrtZ resting in , the ^^^^^^lu^ non-functional crtZ gene of 
« crtB. crtl. crtY of Ftewbartentrm. he crt J* ^f^, pBll SK( + ).PCRBBScrtZ-2 and clonihg it, 
Ravofcacfcr/OTwasob^ 

rar=r.« 
50 «^^rs»^»« £ «-* 



Table 4

plasmid                    CRX   ASX   ADX   ZXN   ECH   HECH   CXN   BCA   ADR

pBIIKScrtEBIYZW            [...]
pBIIKS-crtEBIY[E396WZ]     [...]
pBIIKS-crtEBIY[E396W]ΔZ                                         100

The results for E. coli transformants carrying pBIIKScrtEBIYZW (see example 7) are also shown in Table 4 to indicate
the dramatic effect of the new genes crtW E396 and crtZ E396 on the carotenoids produced in these new transformants.

Example 9 

Cloning of the remaining crt genes of the Gram negative bacterium E-396. 

TG1 E. coli transformants carrying the pJAPCL544 plasmid did not produce detectable quantities of carotenoids
(results not shown). Sequence analysis and comparison of the 3' end (BamHI site) of the insert of plasmid pJAPCL544 to
the crt cluster of Flavobacterium R1534 showed that only part of the C-terminus of the crtE gene was present. This
result explained the lack of carotenoid production in the aforementioned transformants. To isolate the missing N-terminal
part of the gene, genomic DNA of E-396 was digested with six restriction enzymes in different combinations (EcoRI,
BamHI, PstI, SacI, SphI and XbaI) and transferred by the Southern blot technique to nitrocellulose. Hybridization of this
membrane with the 32P radio-labelled probe (a 463 bp PstI-BamHI fragment originating from the 3' end of the insert of
pJAPCL544, Fig. 29) highlighted a ~1300 bp-long PstI-PstI fragment. This fragment was isolated and cloned into the
PstI site of pBSIIKS(+), resulting in plasmid pBSIIKS-#1296. The sequence of the insert is shown in Fig. 38 (lower-case
letters refer to new sequence obtained; capital letters show the sequence also present in the 3' end of the insert of plasmid
pJAPCL544). The complete crtE gene therefore has a length of 882 bp (see Fig. 39) and encodes a GGPP synthase
of 294 amino acids (Fig. 40). The crtE enzyme has 38 % identity with the crtE amino acid sequence of Erwinia herbicola
and 66 % with Flavobacterium R1534 WT.

Construction of plasmids. To obtain a plasmid carrying the complete crt cluster of E-396, the 4.7 kb MluI/BamHI
fragment encoding the genes crtW, crtZ, crtY, crtI and crtB was isolated from pJAPCL544 and cloned into the
MluI/BamHI sites of pUC18-E396crtWZPCR (see example 8). The new construct was named pE396CARcrtW-B (Fig.
41) and lacked the N-terminus of the crtE gene. The missing N-terminal part of the crtE gene was then introduced by
ligation of the aforementioned PstI fragment of pBIIKS-#1296 between the PstI sites of pE396CARcrtW-B. The resulting
plasmid was named pE396CARcrtW-E (Fig. 41). The carotenoid distribution of the E. coli transformants carrying the
aforementioned plasmid was: adonixanthin (65%), astaxanthin (8%) and zeaxanthin (3%). The percentages indicated reflect the
proportion of the total amount of carotenoid produced in the cell.

Example 10

Astaxanthin and adonixanthin production in Flavobacterium R1534 

Among bacteria, Flavobacterium may represent the best source for the development of a fermentative production
process for 3R,3'R-zeaxanthin. Derivatives of Flavobacterium sp. strain R1534 obtained by classical mutagenesis
have attracted wide interest over the past two decades for the development of a large-scale fermentative production of
zeaxanthin, although with little success. Cloning of the carotenoid biosynthesis genes of this organism, as outlined in
example 2, may allow replacement of the classical mutagenesis approach by a more rational one, using molecular tools
to amplify the copy number of relevant genes, deregulate their expression and eliminate bottlenecks in the carotenoid
biosynthesis pathway. Furthermore, the introduction of additional heterologous genes (e.g. crtW) will result in the
production of carotenoids normally not synthesised by this bacterium (astaxanthin, adonirubin, adonixanthin,
canthaxanthin, echinenone). The construction of such recombinant Flavobacterium R1534 strains producing astaxanthin and
adonixanthin is outlined below.

Gene transfer into Flavobacterium sp.

Plasmid transfer by conjugative mobilization. For the conjugational crosses we constructed plasmid pRSF1010-Ampr,
a derivative of the small (8.9 kb) broad host range plasmid RSF1010 (IncQ incompatibility group) [Guerry et al.,
J. Bacteriol. 117:619-630 (1974)] and used E. coli S17-1 as the mobilizing strain [Priefer et al., J. Bacteriol. 163:324-330
(1985)]. In general, any of the IncQ plasmids (e.g. RSF1010, R300B, R1162) may be mobilized into rifampicin
resistant Flavobacterium if the transfer functions are provided by plasmids of the IncP1 group (e.g. R1, R751).

Rifampicin resistant (Rifr) Flavobacterium R1534 cells were obtained by selection on 100 µg rifampicin/ml. One
resistant colony was picked and a stock culture was made. The conjugation protocol was as follows:

Day 1:

- grow a 3 ml culture of Flavobacterium R1534 Rifr for 24 hours at 30 °C in Flavobacter medium (F-medium) (see
example 1)

- grow 3 ml of the mobilizing E. coli strain carrying the mobilizable plasmid O/N at 37 °C in LB medium (e.g. E. coli S17-1
carrying pRSF1010-Ampr or E. coli TG-1 cells carrying R751 and pRSF1010-Ampr)

Day 2: 

- pellet 1 ml of the Flavobacterium R1534 Rifr cells and resuspend in 1 ml of fresh F-medium.

- pellet 1 ml of E. coli cells (see above) and resuspend in 1 ml of LB medium.

- donor and recipient cells are then mixed in ratios of 1:1 and 1:10 in an Eppendorf tube, and 30 µl are then applied
onto a nitrocellulose filter placed on agar plates containing F-medium and incubated O/N at 30 °C.

Day 3: 

- the conjugational mixtures were washed off with F-medium and plated on F-medium containing 100 µg rifampicin
and 100 µg ampicillin/ml for selection of transconjugants and inhibition of the donor cells.

Day 6-8:

- arising clones are plated once more on F-medium containing 100 µg Rif and 100 µg Amp/ml before analysis.

Plasmid transfer by electroporation. The protocol for the electroporation is as follows:

1. add 10 ml of O/N culture of Flavobacterium sp. R1534 to 500 ml F-medium and incubate at 30 °C until
OD600=0.8-0.1

2. harvest cells by centrifugation at 4000 g at 4 °C for 10 min.

3. wash cells in an equal volume of ice-cold deionized water (2 times)

4. resuspend the bacterial pellet in 1 ml ice-cold deionized water

5. take 50 µl of cells for electroporation with 0.1 µg of plasmid DNA

6. electroporation was done using field strengths between 15 and 25 kV/cm and 1-3 ms.

7. after electroporation cells were immediately diluted in 1 ml of F-medium and incubated for 2 hours at 30 °C at 180
rpm before plating on F-medium plates containing the respective selective antibiotic.
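
Step 6 gives the pulse as a field strength, whereas an electroporator is usually set in volts. A minimal sketch of the conversion (voltage = field strength x electrode gap) follows. The gap widths of 0.1 cm and 0.2 cm are assumptions chosen for illustration; the protocol does not state which cuvette was used.

```python
def pulse_voltage_kv(field_kv_per_cm, gap_cm):
    """Voltage (kV) across an electrode gap (cm) that yields the given field strength (kV/cm)."""
    if gap_cm <= 0:
        raise ValueError("gap must be positive")
    return field_kv_per_cm * gap_cm


FIELD_RANGE_KV_PER_CM = (15.0, 25.0)  # range given in step 6 of the protocol
ASSUMED_GAPS_CM = (0.1, 0.2)          # assumption: cuvette gap widths, not stated in the text

if __name__ == "__main__":
    for gap in ASSUMED_GAPS_CM:
        low, high = (pulse_voltage_kv(f, gap) for f in FIELD_RANGE_KV_PER_CM)
        print(f"{gap:.1f} cm gap: {low:.1f} to {high:.1f} kV")
```

Whether a given instrument can deliver the upper end of the range in a wider cuvette would have to be checked against its specification.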

Plasmid constructions: Plasmid pRSF1010-Ampr was obtained by cloning the Ampr gene of pBR322 between the
EcoRI/NotI sites of RSF1010. The Ampr gene was isolated from pBR322 by PCR using primers AmpR1
and AmpR2 as shown in Fig. 42.

AmpR1:

5'-TATAT CGGCCGACTAGTAAGCTT CAAAAAGGATCTTCACCTAG-3' the underlined sequence contains the introduced
restriction sites for EagI, SpeI and HindIII to facilitate subsequent constructions.

AmpR2:

5'-ATAT GAATTC AATAATATTGAAAAAGGAAG-3' the underlined sequence corresponds to an introduced EcoRI
restriction site to facilitate cloning into RSF1010 (see Fig. 42).
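
The restriction sites said to be introduced by the two primers can be located with a simple string scan. The sketch below uses the standard recognition sequences of EagI, SpeI, HindIII and EcoRI; the function name and the 0-based position convention are illustrative choices, not part of the original description.

```python
# Recognition sequences of the enzymes named for primers AmpR1 and AmpR2.
SITES = {
    "EagI": "CGGCCG",
    "SpeI": "ACTAGT",
    "HindIII": "AAGCTT",
    "EcoRI": "GAATTC",
}

AMPR1 = "TATAT CGGCCGACTAGTAAGCTT CAAAAAGGATCTTCACCTAG"
AMPR2 = "ATAT GAATTC AATAATATTGAAAAAGGAAG"


def find_sites(primer, sites=SITES):
    """Map each enzyme to the 0-based start positions of its site in the primer."""
    seq = primer.replace(" ", "").upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = [i for i in range(len(seq) - len(site) + 1)
                     if seq.startswith(site, i)]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    print("AmpR1:", find_sites(AMPR1))
    print("AmpR2:", find_sites(AMPR2))
```

All four recognition sequences are palindromic, so scanning one strand is sufficient here.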

The PCR reaction mix had 10 pM of each primer (AmpR1/AmpR2), 0.5 µg plasmid pBR322 and 3.5 units of the
TaqDNA/Pwo DNA polymerase mix. In total, 35 amplification cycles were made with the profile: 95 °C, 45 sec; 59 °C, 45
sec; 72 °C, 1 min. The PCR product of approx. 950 bp was extracted once with phenol/chloroform and precipitated with 0.3
M sodium acetate and 2 vol. ethanol. The pellet was resuspended in H2O and digested with EcoRI and EagI O/N. The
digest was separated by electrophoresis, and the fragment was isolated from the 1% agarose gel and purified using
GENECLEAN before ligation into the EcoRI and NotI sites of RSF1010. The resulting plasmid was named pRSF1010-Ampr
(Fig. 42).

Plasmid RSF1010-Ampr-crt1 was obtained by isolating the HindIII/NotI fragment of pBIIKS-crtEBIY[E396WZ] and
cloning it between the HindIII/EagI sites of RSF1010-Ampr (Fig. 43). The resulting plasmid RSF1010-Ampr-crt1 carries the
crtW E396, crtZ E396 and crtY genes and the N-terminus of the crtI gene (non-functional). Plasmid RSF1010-Ampr-crt2,
carrying a complete crt cluster composed of the genes crtW E396 and crtZ E396 of E-396 and crtY, crtI, crtB and crtE of
Flavobacterium R1534, was obtained by isolating the large HindIII/XbaI fragment of pBIIKS-crtEBIY[E396WZ] and
cloning it into the SpeI/HindIII sites of RSF1010-Ampr (Fig. 43).

Flavobacterium R1534 transformants carrying either plasmid RSF1010-Ampr, plasmid RSF1010-Ampr-crt1 or
plasmid RSF1010-Ampr-crt2 were obtained by conjugation as outlined above using E. coli S17-1 as the mobilizing strain.

Comparison of the carotenoid production of two Flavobacterium transformants. Overnight cultures of the individual
transformants were diluted into 20 ml fresh F-medium to a starting OD600 of 0.4. Cells were harvested after
growing for 48 hours at 30 °C and carotenoid contents were analysed as outlined in example 7. Table 5 shows the results
for the three control cultures Flavobacterium [R1534 WT], [R1534 WT Rifr] (rifampicin resistant) and [R1534 WT Rifr
RSF1010-Ampr] (carries the RSF1010-Ampr plasmid) and the two transformants [R1534 WT RSF1010-Ampr-crt1]
and [R1534 WT RSF1010-Ampr-crt2]. Both of the latter transformants are able to synthesise astaxanthin and adonixanthin
but little zeaxanthin. Most interesting is the [R1534 WT RSF1010-Ampr-crt2] Flavobacterium transformant, which
produces approx. 4 times more carotenoids than R1534 WT. This increase in total carotenoid production is most likely
due to the increase in the number of carotenoid biosynthesis clusters present in these cells (i.e. it corresponds to the total
copy number of plasmids in the cell).
Table 5

Transformant                      carotenoids, % of total dry weight    total carotenoid content in % of dry weight

R1534 WT                          0.039% β-Carotene                     0.06%
                                  0.001% β-Cryptoxanthin
                                  0.018% Zeaxanthin

R1534 Rif                         0.036% β-Carotene                     0.06%
                                  0.002% β-Cryptoxanthin
                                  0.022% Zeaxanthin

R1534 Rif [RSF1010-Ampr]          0.021% β-Carotene                     0.065%
                                  0.002% β-Cryptoxanthin
                                  0.032% Zeaxanthin

R1534 Rif [RSF1010-Ampr-crt1]     0.022% Astaxanthin                    0.1%
                                  0.075% Adonixanthin
                                  0.004% Zeaxanthin

R1534 Rif [RSF1010-Ampr-crt2]     0.132% β-Carotene                     0.235%
                                  0.006% Echinenone
                                  0.004% Hydroxyechinenone
                                  0.003% β-Cryptoxanthin
                                  0.044% Astaxanthin
                                  0.039% Adonixanthin
                                  0.007% Zeaxanthin
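
The "approx. 4 times" figure quoted for the crt2 transformant can be reproduced from the total carotenoid contents in Table 5. The sketch below is illustrative only: the dictionary simply restates the totals from the table, and the function name is a hypothetical choice.

```python
# Total carotenoid content (% of dry weight) as listed in Table 5.
TOTAL_PERCENT_DRY_WEIGHT = {
    "R1534 WT": 0.06,
    "R1534 Rif": 0.06,
    "R1534 Rif [RSF1010-Ampr]": 0.065,
    "R1534 Rif [RSF1010-Ampr-crt1]": 0.1,
    "R1534 Rif [RSF1010-Ampr-crt2]": 0.235,
}


def fold_change(strain, reference="R1534 WT", totals=TOTAL_PERCENT_DRY_WEIGHT):
    """Total carotenoid content of a strain relative to the reference strain."""
    return totals[strain] / totals[reference]


if __name__ == "__main__":
    for strain in TOTAL_PERCENT_DRY_WEIGHT:
        print(f"{strain}: {fold_change(strain):.2f}x")
```

With these values the crt2 transformant comes out at roughly 3.9 times the wild-type content, in line with the statement in the text.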
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: F. HOFFMANN-LA ROCHE AG
(B) STREET: GRENZACHERSTRASSE 124
(C) CITY: BASLE
(D) STATE: BS
(E) COUNTRY: SWITZERLAND
(F) POSTAL CODE (ZIP): CH-4002
(G) TELEPHONE: 061 - 688 2505
(H) TELEFAX: 061 688 1395
(I) TELEX: 962292/965542 hlr ch

(ii) TITLE OF INVENTION: Improved fermentative carotenoid production

(iii) NUMBER OF SEQUENCES: 17

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

ATGAGCGCAC ATGCCCTGCC CAAGGCAGAT CTGACCGCCA CCAGTTTGAT CGTCTCGGGC 60

GGCATCATCG CCGCGTGGCT GGCCCTGCAT GTGCATGCGC TGTGGTTTCT GGACGCGGCG 120 

GCGCATCCCA TCCTGGCGGT CGCGAATTTC CTGGGGCTGA CCTGGCTGTC GGTCGGTCTG 180 

TTCATCATCG CGCATGACGC GATGCATGGG TCGGTCGTGC CGGGGCGCCC GCGCGCCAAT 240 

GCGGCGATGG GCCAGCTTGT CCTGTGGCTG TATGCCGGAT TTTCCTGGCG CAAGATGATC 300 

GTCAAGCACA TGGCCCATCA TCGCCATGCC GGAACCGACG ACGACCCAGA TTTCGACCAT 360 

GGCGGCCCGG TCCGCTGGTA CGCCCGCTTC ATCGGCACCT ATTTCGGCTG GCGCGAGGGG 420 

CTGCTGCTGC CCGTCATCGT GACGGTCTAT GCGCTGATGT TGGGGGATCG CTGGATGTAC 480

GTGGTCTTCT GGCCGTTGCC GTCGATCCTG GCGTCGATCC AGCTGTTCGT GTTCGGCATC 540 

TGGCTGCCGC ACCGCCCCGG CCACGACGCG TTCCCGGACC GCCACAATGC GCGGTCGTCG 600 

CGGATCAGCG ACCCCGTGTC GCTGCTGACC TGCTTTCACT TTGGCGGTTA TCATCACGAA 660

CACCACCTGC ACCCGACGGT GCCTTGGTGG CGCCTGCCCA GCACCCGCAC CAAGGGGGAC 720 

ACCGCATGA 729 
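
A listing row can be cross-checked against the corresponding protein listing by stripping the spacing and the trailing position counter and translating the bases. The sketch below does this for the first row of SEQ ID NO: 1 and compares the result with the first twenty residues given for SEQ ID NO: 2. The helper names are illustrative, and the standard genetic code is assumed.

```python
import re

# Standard genetic code, indexed by the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# First row of SEQ ID NO: 1 and first twenty residues of SEQ ID NO: 2.
ROW_1 = "ATGAGCGCAC ATGCCCTGCC CAAGGCAGAT CTGACCGCCA CCAGTTTGAT CGTCTCGGGC 60"
PROTEIN_START = ("Met Ser Ala His Ala Leu Pro Lys Ala Asp "
                 "Leu Thr Ala Thr Ser Leu Ile Val Ser Gly")


def parse_listing_row(row):
    """Join the base groups of one listing row, dropping spaces and the counter."""
    return "".join(re.findall(r"[ACGT]+", row.upper()))


def translate(seq):
    """Translate complete codons from the first base; trailing bases are ignored."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def to_one_letter(three_letter):
    """Convert a space-separated three-letter residue string to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


if __name__ == "__main__":
    dna = parse_listing_row(ROW_1)
    print(len(dna), translate(dna))
    print(translate(dna) == to_one_letter(PROTEIN_START))
```

The same check applied row by row is a convenient way to spot transcription errors in either listing.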
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 242 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ser Ala His Ala Leu Pro Lys Ala Asp Leu Thr Ala Thr Ser Leu
1 5 10 15

Ile Val Ser Gly Gly Ile Ile Ala Ala Trp Leu Ala Leu His Val His
20 25 30

Ala Leu Trp Phe Leu Asp Ala Ala Ala His Pro Ile Leu Ala Val Ala
35 40 45

Asn Phe Leu Gly Leu Thr Trp Leu Ser Val Gly Leu Phe Ile Ile Ala
50 55 60

His Asp Ala Met His Gly Ser Val Val Pro Gly Arg Pro Arg Ala Asn
65 70 75 80

Ala Ala Met Gly Gln Leu Val Leu Trp Leu Tyr Ala Gly Phe Ser Trp
85 90 95



Arg Phe Ile Gly Thr Tyr Phe Gly Trp Arg Glu Gly Leu Leu Leu Pro



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 486 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATGACCAATT TCCTGATCGT CGTCGCCACC GTGCTGGTGA TGGAGCTGAC GGCCTATTCC
GTCCACCGCT GGATCATGCA CGGCCCCTTG GGCTGGGGCT GGCACAAGTC CCACCACGAG 
GAACACGACC ACGCGCTGGA AAAGAACGAC CTGTACGGCC TGGTCTTTGC GGTGATCGCC 
ACGGTGCTGT TCACGGTGGG CTGGATCTGG GCACCGGTCC TGTGGTGGAT CGCCTTGGGC 
ATGACCGTCT ACGGGCTGAT CTATTTCGTC CTGCATGACG GGCTGGTGCA TCAGCGCTGG 
CCGTTCCGCT ATATCCCTCG CAAGGGCTAT GCCAGACGCC TGTATCAGGC CCACCGCCTG 
CACCACGCGG TCGAGGGGCG CGACCATTGC GTCAGCTTCG GCTTCATCTA TGCGCCGCCG 
GTCGACAAGC TGAAGCAGGA CCTGAAGACG TCGGGCGTGC TGCGGGCCGA GGCGCAGGAG 
CGCACG 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 162 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Thr Asn Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu
1 5 10 15

Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro Leu Gly Trp
20 25 30

Gly Trp His Lys Ser His His Glu Glu His Asp His Ala Leu Glu Lys
35 40 45

Asn Asp Leu Tyr Gly Leu Val Phe Ala Val Ile Ala Thr Val Leu Phe
50 55 60

Thr Val Gly Trp Ile Trp Ala Pro Val Leu Trp Trp Ile Ala Leu Gly
65 70 75 80

Met Thr Val Tyr Gly Leu Ile Tyr Phe Val Leu His Asp Gly Leu Val
85 90 95
85 90 95 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 882 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGAGACGAG ACGTCAACCC GATCCACGCC ACCCTTCTGC AGACCAGACT TGAGGAGATC 60 

GCCCAGGGAT TCGGTGCCGT GTCGCAGCCG CTCGGCCCGG CCATGAGCCA TGGCGCGCTG 120 

TCGTCGGGCA AGCGTTTCCG CGGCATGCTG ATGCTGCTTG CGGCAGAAGC CTCGGGCGGG 180

GTCTGCGACA CGATCGTCGA CGCCGCCTGC GCGGTCGAGA TGGTGCATGC CGCATCGCTG 240 

ATCTTCGACG ACCTGCCCTG CATGGACGAT GCCGGGCTGC GCCGCGGCCA GCCCGCGACC 300

CATGTGGCGC ATGGCGAAAG CCGCGCCGTG CTAGGCGGCA TCGCCCTGAT CACCGAGGCG 360 

ATGGCCCTGC TGGCCGGTGC GCGCGGCGCG TCGGGCACGG TGCGGGCGCA GCTGGTGCGG 420 

ATCCTGTCGC GGTCCCTGGG GCCGCAGGGC CTGTGCGCCG GCCAGGACCT GGACCTGCAC 480 

GCGGCCAAGA ACGGCGCGGG GGTCGAACAG GAACAGGACC TGAAGACCGG CGTGCTGTTC 540 

ATCGCCGGGC TGGAGATGCT GGCCGTGATC AAGGAGTTCG ACGCCGAGGA GCAGACTCAG 600 

ATGATCGACT TTGGCCGTCA GCTGGGCCGG GTGTTCCAGT CCTATGACGA CCTGCTGGAC 660 

GTTGTGGGCG ACCAGGCGGC GCTTGGCAAG GATACCGGTC GCGATGCGGC GGCCCCCGGC 720 

CCGCGGCGCG GCCTTCTGGC CGTGTCAGAC CTGCAGAACG TGTCCCGTCA CTATGAGGCC 780 

AGCCGCGCCC AGCTGGACGC GATGCTGCGC AGCAAGCGCC TTCAGGCTCC GGAAATCGCG 840 

GCCCTGCTGG AACGGGTTCT GCCCTACGCC GCGCGCGCCT AG 882 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 293 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Arg Arg Asp Val Asn Pro Ile His Ala Thr Leu Leu Gln Thr Arg
1 5 10 15

Leu Glu Glu Ile Ala Gln Gly Phe Gly Ala Val Ser Gln Pro Leu Gly
20 25 30

Pro Ala Met Ser His Gly Ala Leu Ser Ser Gly Lys Arg Phe Arg Gly
35 40 45

Met Leu Met Leu Leu Ala Ala Glu Ala Ser Gly Gly Val Cys Asp Thr
50 55 60

Ile Val Asp Ala Ala Cys Ala Val Glu Met Val His Ala Ala Ser Leu
65 70 75 80

Ile Phe Asp Asp Leu Pro Cys Met Asp Asp Ala Gly Leu Arg Arg Gly
85 90 95

Gln Pro Ala Thr His Val Ala His Gly Glu Ser Arg Ala Val Leu Gly
100 105 110

Gly Ile Ala Leu Ile Thr Glu Ala Met Ala Leu Leu Ala Gly Ala Arg
115 120 125

Gly Ala Ser Gly Thr Val Arg Ala Gln Leu Val Arg Ile Leu Ser Arg
130 135 140

Ser Leu Gly Pro Gln Gly Leu Cys Ala Gly Gln Asp Leu Asp Leu His
145 150 155 160

Ala Ala Lys Asn Gly Ala Gly Val Glu Gln Glu Gln Asp Leu Lys Thr
165 170 175

Gly Val Leu Phe Ile Ala Gly Leu Glu Met Leu Ala Val Ile Lys Glu
180 185 190

Phe Asp Ala Glu Glu Gln Thr Gln Met Ile Asp Phe Gly Arg Gln Leu
195 200 205

Gly Arg Val Phe Gln Ser Tyr Asp Asp Leu Leu Asp Val Val Gly Asp
210 215 220

Gln Ala Ala Leu Gly Lys Asp Thr Gly Arg Asp Ala Ala Ala Pro Gly
225 230 235 240

Pro Arg Arg Gly Leu Leu Ala Val Ser Asp Leu Gln Asn Val Ser Arg
245 250 255

His Tyr Glu Ala Ser Arg Ala Gln Leu Asp Ala Met Leu Arg Ser Lys
260 265 270

Arg Leu Gln Ala Pro Glu Ile Ala Ala Leu Leu Glu Arg Val Leu Pro
275 280 285

Tyr Ala Ala Arg Ala



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 295 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Thr Pro Lys Gln Gln Phe Pro Leu Arg Asp Leu Val Glu Ile Arg
1 5 10 15

Leu Ala Gln Ile Ser Gly Gln Phe Gly Val Val Ser Ala Pro Leu Gly
20 25 30

Ala Ala Met Ser Asp Ala Ala Leu Ser Pro Gly Lys Arg Phe Arg Ala
35 40 45

EP 0 872 554 A2

Val Leu Met Leu Met Val Ala Glu Ser Ser Gly Gly Val Cys Asp Ala
50 55 60

Met Val Asp Ala Ala Cys Ala Val Glu Met Val His Ala Ala Ser Leu
65 70 75 80

Ile Phe Asp Asp Met Pro Cys Met Asp Asp Ala Arg Thr Arg Arg Gly
85 90 95

Gln Pro Ala Thr His Val Ala His Gly Glu Gly Arg Ala Val Leu Ala
100 105 110

Gly Ile Ala Leu Ile Thr Glu Ala Met Arg Ile Leu Gly Glu Ala Arg
115 120 125

Gly Ala Thr Pro Asp Gln Arg Ala Arg Leu Val Ala Ser Met Ser Arg
130 135 140

Ala Met Gly Pro Val Gly Leu Cys Ala Gly Gln Asp Leu Asp Leu His
145 150 155 160

Ala Pro Lys Asp Ala Ala Gly Ile Glu Arg Glu Gln Asp Leu Lys Thr
165 170 175

Gly Val Leu Phe Val Ala Gly Leu Glu Met Leu Ser Ile Ile Lys Gly
180 185 190

Leu Asp Lys Ala Glu Thr Glu Gln Leu Met Ala Phe Gly Arg Gln Leu
195 200 205

Gly Arg Val Phe Gln Ser Tyr Asp Asp Leu Leu Asp Val Ile Gly Asp
210 215 220



His Asp Ile Arg Arg Ser Ala
290



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 888 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ATGACGCCCA AGCAGCAATT CCCCCTACGC GATCTGGTCG AGATCAGGCT GGCGCAGATC 
TCGGGCCAGT TCGGCGTGGT CTCGGCCCCG CTCGGCGCGG CCATGAGCGA TGCCGCCCTG 
TCCCCCGGCA AACGCTTTCG CGCCGTGCTG ATGCTGATGG TCGCCGAAAG CTCGGGCGGG 
GTCTGCGATG CGATGGTCGA TGCCGCCTGC GCGGTCGAGA TGGTCCATGC CGCATCGCTG 
ATCTTCGACG ACATGCCCTG CATGGACGAT GCCAGGACCC GTCGCGGTCA GCCCGCCACC 
[...] ATGGCGAGGG GCGCGCGGTG CTTGCGGGCA TCGCCCTGAT CACCGAGGCC
[...] AGCGCGCAAG GCTGGTCGCA
[...] GCGCGATGGG ACCGGTGGGG CTGTGCGCAG GGCAGGATCT GGACCTGCAC
[...] GATCGAACGT GAACAGGACC [...]
GTCGCGGGCC TCGAGATGCT GTCCATTATT AAGGGTCTGG ACAAGGCCGA GACCGAGCAG
CTCATGGCCT TCGGGCGTCA GCTTGGTCGG GTCTTCCAGT CCTATGACGA CCTGCTGGAC
GTGATCGGCG ACAAGGCCAG CACCGGCAAG GATACGGCGC GCGACACCGC CGCCCCCGGC 720
CGAAAGGGGC CCCTCATCCC GCTCGGACAG ATGGGGCSAGG TGGCGCAGCA TTACCGCGCC 780
AGCCGCGCGC AACTGGACGA GCTGATGCGC ACCCGGCTGT TCCGCGGGGG GCAGATCGCC 840
GACCTGCTGG CCCGCGTGCT GCCGCATGAC ATCCGCCGCA GCGCCTAG 888




(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 303 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Thr Asp Leu Thr Ala Thr Ser Glu Ala Ala Ile Ala Gln Gly Ser
1 5 10 15

Gln Ser Phe Ala Gln Ala Ala Lys Leu Met Pro Pro Gly Ile Arg Glu
20 25 30

Asp Thr Val Met Leu Tyr Ala Trp Cys Arg His Ala Asp Asp Val Ile
35 40 45

Asp Gly Gln Val Met Gly Ser Ala Pro Glu Ala Gly Gly Asp Pro Gln
50 55 60

Ala Arg Leu Gly Ala Leu Arg Ala Asp Thr Leu Ala Ala Leu His Glu
65 70 75 80

Asp Gly Pro Met Ser Pro Pro Phe Ala Ala Leu Arg Gln Val Ala Arg
85 90 95

Arg His Asp Phe Pro Asp Leu Trp Pro Met Asp Leu Ile Glu Gly Phe
100 105 110

Ala Met Asp Val Ala Asp Arg Glu Tyr Arg Ser Leu Asp Asp Val Leu
115 120 125

Glu Tyr Ser Tyr His Val Ala Gly Val Val Gly Val Met Met Ala Arg
130 135 140

Val Met Gly Val Gln Asp Asp Ala Val Leu Asp Arg Ala Cys Asp Leu
145 150 155 160

Gly Leu Ala Phe Gln Leu Thr Asn Ile Ala Arg Asp Val Ile Asp Asp
165 170 175

Ala Ala Ile Gly Arg Cys Tyr Leu Pro Ala Asp Trp Leu Ala Glu Ala
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 908 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATGACCGATC TGACGGCGAC TTCCGAAGCG GCCATCGCGC AGGGTTCGCA AAGCTTCGCG 60 

CAGGCGGCCA AGCTGATGCC GCCCGGCATC CGCGAGGATA CGGTCATGCT CTATGCCTGG 120 

TGCAGGCATG CGGATGACGT GATCGACGGG CAGGTGATGG GTTCTGCCCC CGAGGCGGGC 180 

GGCGACCCAC AGGCGCGGCT GGGGGCGCTG CGCGCCGACA CGCTGGCCGC GCTGCACGAG 240 

GACGGCCCGA TGTCGCCGCC CTTCGCGGCG CTGCGCCAGG TCGCCCGGCG GCATGATTTC 300 

CCGGACCTTT GGCCGATGGA CCTGATCGAG GGTTTCGCGA TGGATGTCGC GGATCGCGAA 360 

TACCGCAGCC TGGATGACGT GCTGGAATAT TCCTACCACG TCGCGGGGGT CGTGGGCGTG 420 

ATGATGGCGC GGGTGATGGG CGTGCAGGAC GATGCGGTGC TGGATCGCGC CTGCGATCTG 480 

GGCCTTGCGT TCCAGCTGAC GAACATCGCT CGCGACGTGA TCGACGATGC CGCCATCGGG 540 

CGCTGCTATC TGCCTGCCGA CTGGCTGGCC GAGGCGGGGG CGACGGTTGA GGGTCCGGTG 600 

CCTTCGGACG CGCTCTATTC CGTCATCATC CGCCTGCTTG ACGCGGCCGA GCCCTATTAT 660 

GCCTCGGCGC GGCAGGGGCT TCCGCATCTG CCGCCGCGCT GCGCGTGGTC GATCGCCGCC 720 

GCGCTGCGTA TCTATCGCGC AATCGGGACG CGCATCCGGC AGGGTGGCCC CGAGGCCTAT 780 

CGCCAGCGGA TCAGCACGTC GAAGGCTGCC AAGATCGGGC TTCTGGCGCG CGGAGGCTTG 840 

GACGCGGCCG CATCGCGCCT GCGCGGCGGC GAAATCAGCC GCGACGGCCT GTGGACCCGA 900

CCGCGCGC 908 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 494 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Ser Ser Ala Ile Val Ile Gly Ala Gly Phe Gly Gly Leu Ala Leu
1 5 10 15

Ala Ile Arg Leu Gln Ser Ala Gly Ile Ala Thr Thr Ile Val Glu Ala
20 25 30

Arg Asp Lys Pro Gly Gly Arg Ala Tyr Val Trp Asn Asp Gln Gly His
35 40 45

Val Phe Asp Ala Gly Pro Thr Val Val Thr Asp Pro Asp Ser Leu Arg
50 55 60

Glu Leu Trp Ala Leu Ser Gly Gln Pro Met Glu Arg Asp Val Thr Leu
65 70 75 80

Leu Pro Val Ser Pro Phe Tyr Arg Leu Thr Trp Ala Asp Gly Arg Ser
85 90 95

Phe Glu Tyr Val Asn Asp Asp Asp Glu Leu Ile Arg Gln Val Ala Ser
100 105 110

[...]
115 120 125

Glu Glu Val Tyr Arg Glu Gly Tyr Leu Lys Leu Gly Thr Thr Pro Phe
130 135 140

Leu Lys Leu Gly Gln Met Leu Asn Ala Ala Pro Ala Leu Met Arg Leu
145 150 155 160

Gln Ala Tyr Arg Ser Val His Ser Met Val Ala Arg Phe Ile Gln Asp
165 170 175

Pro His Leu Arg Gln Ala Phe Ser Phe His Thr Leu Leu Val Gly Gly
180 185 190

Asn Pro Phe Ser Thr Ser Ser Ile Tyr Ala [...]
195 [...]

Arg Arg Gly Gly Val Trp Phe [...]
210 215

[...]

[...] Gly His Thr Arg
285

Arg Gly Arg Thr Lys Ala Ala Ile Leu Asn Arg Gln Arg Trp Ser Met

Ala His His Ser Val Ile Phe Gly Pro Arg Tyr Lys Gly Leu Val Asn
325 330 335

Glu Ile Phe Asn Gly [...]
340

His Ser Pro Cys Val Thr Asp Pro Ser Leu Ala Pro Glu Gly Met Ser
355 360 365

Thr His Tyr Val Leu Ala Pro Val Pro His Leu Gly Arg Ala Asp Val
370 375 380

Asp Trp Glu Ala Glu Ala Pro Gly Tyr Ala Glu Arg Ile Phe Glu Glu
385 390 395 400

Leu Glu Arg Arg Ala Ile Pro Asp Leu Arg Lys His Leu Thr Val Ser



Ala Gly Thr His Pro Gly Ala Gly Ile Pro Gly Val Val Gly Ser Ala
465 470 475 480

Lys Ala Thr Ala Gln Val Met Leu Ser Asp Leu Ala Val Ala
485 490

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1482 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATGAGTTCCG CCATCGTCAT CGGCGCAGGT TTCGGCGGGC TTGCGCTTGC CATCCGCCTG 60 

CAATCGGCCG GCATCGCGAC CACCATCGTC GAGGCCCGCG ACAAGCCCGG CGGCCGCGCC 120 

TATGTCTGGA ACGATCAGGG CCACGTCTTC GATGCAGGCC CGACGGTCGT GACCGACCCC 180 

GACAGCCTGC GAGAGCTGTG GGCCCTCAGC GGCCAACCGA TGGAGCGTGA CGTGACGCTG 240 

CTGCCGGTCT CGCCCTTCTA CCGGCTGACA TGGGCGGACG GCCGCAGCTT CGAATACGTG 300 

AACGACGACG ACGAGCTGAT CCGCCAGGTC GCCTCCTTCA ATCCCGCCGA TGTCGATGGC 360 

TATCGCCGCT TCCACGATTA CGCCGAGGAG GTCTATCGCG AGGGGTATCT GAAGCTGGGG 420 

ACCACGCCCT TCCTGAAGCT GGGCCAGATG CTGAACGCCG CGCCGGCGCT GATGCGCCTG 480 

CAGGCATACC GCTCGGTCCA CAGCATGGTG GCGCGCTTCA TCCAGGACCC GCATCTGCGG 540 

CAGGCCTTCT CGTTCCACAC GCTGCTGGTC GGCGGGAACC CGTTTTCGAC CAGCTCGATC 600 

TATGCGCTGA TCCATGCGCT GGAACGGCGC GGCGGCGTCT GGTTCGCCAA GGGCGGCACC 660 
AACCAGCTGG TCGCGGGCAT GGTCGCCCTG TTCGAGCGTC TTGGCGGCAC GCTGCTGCTG 720

AATGCCCGCG TCACGCGGAT CGACACCGAG GGCGATCGCG CCACGGGCGT CACGCTGCTG 780 

GACGGGCGGC AGTTGCGCGC GGATACGGTG GCCAGCAACG GCGACGTGAT GCACAGCTAT 840 

CGCGACCTGC TGGGCCATAC CCGCCGCGGG CGCACCAAGG CCGCGATCCT GAACCGGCAG 900 

CGCTGGTCGA TGTCGCTGTT CGTGCTGCAT TTCGGCCTGT CCAAGCGCCC CGAGAACCTG 960 

GCCCACCACA GCGTCATCTT CGGCCCGCGC TACAAGGGGC TGGTGAACGA GATCTTCAAC 1020

GGGCCACGCC TGCCGGACGA TTTCTCGATG TATCTGCATT CGCCCTGCGT GACCGATCCC 1080 

AGCCTGGCCC CCGAGGGGAT GTCCACGCAT TACGTCCTTG CGCCCGTTCC GCATCTGGGC 1140 

[...] 1200

CTGGAGCGCC GCGCCATCCC CGACCTGCGC AAGCACCTGA CCGTCAGCCG CATCTTCAGC 1260 

CCCGCCGATT TCAGCACCGA ACTGTCGGCC CATCACGGCA GCGCCTTCTC GGTCGAGCCG 1320

ATCCTGACGC AATCCGCCTG GTTCCGCCCG CATAACCGCG ACCGCGCGAT CCCGAACTTC 1380 

TACATCGTGG GGGCGGGCAC GCATCCGGGT GCGGGCATCC CGGGTGTCGT TGGCAGCGCC 1440 

AAGGCCACGG CGCAGGTCAT GCTGTCGGAC CTGGCCGTCG CA 1482 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 382 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Ser His Asp Leu Leu Ile Ala Gly Ala Gly Leu Ser Gly Ala Leu
1 5 10 15

Ile Ala Leu Ala Val Arg Asp Arg Arg Pro Asp Ala Arg Ile Val Met
20 25 30

Leu Asp Ala Arg Ser Gly Pro Ser Asp Gln His Thr Trp Ser Cys His
35 40 45

Asp Thr Asp Leu Ser Pro Glu Trp Leu Ala Arg Leu Ser Pro Ile Arg
50 55 60

Arg Gly Glu Trp Thr Asp Gln Glu Val Ala Phe Pro Asp His Ser Arg
65 70 75 80

Arg Leu Thr Thr Gly Tyr Gly Ser Ile Glu Ala Gly Ala Leu Ile Gly
85 90 95

Leu Leu Gln Gly Val Asp Leu Arg Trp Asn Thr His Val Ala Thr Leu
100 105 110

Asp Asp Thr Gly Ala Thr Leu Thr Asp Gly Ser Arg Ile Glu Ala Ala
115 120 125

Cys Val Ile Asp Ala Arg Gly Ala Val Glu Thr Pro His Leu Thr Val
130 135 140

Gly Phe Gln Lys Phe Val Gly Val Glu Ile Glu Thr Asp Ala Pro His

Gly Val Glu Arg Pro Met Ile Met Asp Ala Thr Val Pro Gln Met Asp
165 170 175

Gly Tyr Arg Phe Ile Tyr Leu Leu Pro Phe Ser Pro Thr Arg Ile Leu



Ala Ser Ala Arg Arg Ala Val Arg Gly Trp Ala Ile Asp Arg Ala Asp



Pro Pro Asp Arg Arg Tyr Arg Leu Leu Gln Arg Phe Tyr Arg Leu Pro
325 330 335

Gln Pro Leu Ile Glu Arg Phe Tyr Ala Gly Arg Leu Thr Leu Ala Asp



Val Arg Cys Leu Pro Glu Arg Pro Leu Leu Gln Glu Arg Ala
370 375 380

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
ATGAGCCATG ATCTGCTGAT CGCGGGCGCG GGGCTGTCCG GTGCGCTGAT CGCGCTTGCC 
GTTCGCGACC GCAGACCGGA TGCGCGCATC GTGATGCTCG ACGCGCGGTC CGGCCCCTCG 
GACCAGCACA CCTGGTCCTG CCACGACACG GATCTTTCGC CCGAATGGCT GGCGCGCCTG 
TCGCCCATTC GTCGCGGCGA ATGGACGGAT CAGGAGGTCG CGTTTCCCGA CCATTCGCGC 
CGCCTGACGA CAGGCTATGG CTCGATCGAG GCGGGCGCGC TGATCGGGCT GCTGCAGGGT 
GTCGATCTGC GGTGGAATAC GCATGTCGCG ACGCTGGACG ATACCGGCGC GACGCTGACG 
GACGGCTCGC GGATCGAGGC TGCCTGCGTG ATCGACGCCC GTGGTGCCGT CGAGACCCCG 
CACCTGACCG TGGGTTTCCA GAAATTCGTG GGCGTCGAGA TCGAGACCGA CGCCCCCCAT 480

GGCGTCGAGC GCCCGATGAT CATGGACGCG ACCGTTCCGC AGATGGACGG GTACCGCTTC 540 

ATCTATCTGC TGCCCTTCAG TCCCACCCGC ATCCTGATCG AGGATACGCG CTACAGCGAC 600 

GGCGGCGATC TGGACGATGG CGCGCTGGCG CAGGCGTCGC TGGACTATGC CGCCAGGCGG 660 

GGCTGGACCG GGCAGGAGAT GCGGCGCGAA AGGGGCATCC TGCCCATCGC GCTGGCCCAT 720 

GACGCCATAG GCTTCTGGCG CGACCACGCG CAGGGGGCGG TGCCGGTTGG GCTGGGGGCA 780

GGGCTGTTCC ACCCCGTCAC CGGATATTCG CTGCCCTATG CCGCGCAGGT CGCGGATGCC 840 

ATCGCGGCGC GCGACCTGAC GACCGCGTCC GCCCGTCGCG CGGTGCGCGG CTGGGCCATC 900 

GATCGCGCGG ATCGCGACCG CTTCCTGCGG CTGCTGAACC GGATGCTGTT CCGCGGCTGC 960 

CCGCCCGACC GTCGCTATCG CCTGCTGCAG CGGTTCTACC GCCTGCCGCA GCCGCTGATC 1020 

GAGCGCTTCT ATGCCGGGCG CCTGACATTG GCCGACCGGC TTCGCATCGT CACCGGACGC 1080 

CCGCCCATTC CGCTGTCGCA GGCCGTGCGC TGCCTGCCCG AACGCCCCCT GCTGCAGGAG 1140 

AGAGCATGA 1149 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 169 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Ser Thr Trp Ala Ala Ile Leu Thr Val Ile Leu Thr Val Ala Ala 
1 5 10 15 

Met Glu Leu Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro 
20 25 30 

Leu Gly Trp Gly Trp His Lys Ser His His Asp Glu Asp His Asp His 
35 40 45 

Ala Leu Glu Lys Asn Asp Leu Tyr Gly Val Ile Phe Ala Val Ile Ser 
50 55 60 

Ile Val Leu Phe Ala Ile Gly Ala Met Gly Ser Asp Leu Ala Trp Trp 
65 70 75 80 

Leu Ala Val Gly Val Thr Cys Tyr Gly Leu Ile Tyr Tyr Phe Leu His 
85 90 95 

Asp Gly Leu Val His Gly Arg Trp Pro Phe Arg Tyr Val Pro Lys Arg 
100 105 110 

Gly Tyr Leu Arg Arg Val Tyr Gln Ala His Arg Met His His Ala Val 
115 120 125 

His Gly Arg Glu Asn Cys Val Ser Phe Gly Phe Ile Trp Ala Pro Ser 
130 135 140 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 506 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

ATGAGCACTT GGGCCGCAAT CCTGACCGTC ATCCTGACCG TCGCCGCGAT GGAGCTGACG 60 

GCCTACTCCG TCCATCGGTG GATCATGCAT GGCCCCCTGG GCTGGGGCTG GCATAAATCG 120 

CACCACGACG AGGATCACGA CCACGCGCTC GAGAAGAACG ACCTCTATGG CGTCATCTTC 180 

GCGGTAATCT CGATCGTGCT GTTCGCGATC GGCGCGATGG GGTCGGATCT GGCCTGGTGG 240 

CTGGCGGTGG GGGTCACCTG CTACGGGCTG ATCTACTATT TCCTGCATGA CGGCTTGGTG 300 

CATGGGCGCT GGCCGTTCCG CTATGTCCCC AAGCGCGGCT ATCTTCGTCG CGTCTACCAG 360 

GCACACAGGA TGCATCACGC GGTCCATGGC CGCGAGAACT GCGTCAGCTT CGGTTTCATC 420 

TGGGCGCCCT CGGTCGACAG CCTCAAGGCA GAGCTGAAAC GCTCGGGCGC GCTGCTGAAG 480 

GACCGCGAAG GGGCGGATCG CAATAC 506 
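
As a consistency check between SEQ ID NO: 16 and SEQ ID NO: 15, the following minimal Python sketch translates the first 60 nucleotides listed above with the standard genetic code. The helper name `translate` and the compact construction of the codon table are illustrative assumptions and form no part of the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 60 nucleotides of SEQ ID NO: 16 as listed above (hypothetical variable name).
SEQ16_START = "ATGAGCACTT GGGCCGCAAT CCTGACCGTC ATCCTGACCG TCGCCGCGAT GGAGCTGACG"
print(translate(SEQ16_START))
```

The printed one-letter string can be compared with residues 1 to 20 of SEQ ID NO: 15 (Met Ser Thr Trp Ala Ala Ile Leu Thr Val Ile Leu Thr Val Ala Ala Met Glu Leu Thr).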
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

ATGTCCGGTC GTAAACCGGG TACCACCGGT GACACCATCG TTAACCTGGG TCTGACCGCT 60 

GCTATCCTGC TGTGCTGGCT GGTTCTGCAC GCTTTCACCC TGTGGCTGCT GGACGCTGCT 120 

GCTCACCCGC TGCTGGCTGT TCTGTGCCTG GCTGGTCTGA CCTGGCTGTC CGTTGGTCTG 180 

TTCATCATCG CTCACGACGC TATGCACGGT TCCGTTGTTC CGGGTCGTCC GCGGGCTAAC 240 

GCTGCTATCG GTCAGCTGGC TCTGTGGCTG TACGCTGGTT TCTCCTGGCC GAAACTGATC 300 

GCTAAACACA TGACCCACCA CCGTCACGCT GGTACCGACA ACGACCCGGA CTTCGGTCAC 360 

GGTGGTCCGG TTCGTTGGTA CGGTTCCTTC GTTTCCACCT ACTTCGGTTG GCGTGAAGGT 420 

CTGCTGCTGC CGGTTATCGT TACCACCTAC GCTCTGATCC TGGGTGACCG TTGGATGTAC 480 

GTTATCTTCT GGCCGGTTCC GGCTGTTCTG GCTTCCATCC AGATCTTCGT TTTCGGTACC 540 

TGGCTGCCGC ACCGTCCGGG TCACGACGAC TTCCCGGACC GTCACAACGC TCGTTCCACC 600 

GGTATCGGTG ACCCGCTGTC CCTGCTGACC TGCTTCCACT TCGGTGGTTA CCACCACGAA 660 
CACCACCTGC ACCCGCACGT TCCGTGGTGG CGTCTGCCGC GTACCCGTAA AACCGGTGGT 720 
CGTGCT 726 




Claims 

1. A process for the preparation of canthaxanthin by culturing under suitable culture conditions a cell which is 
transformed by a DNA sequence comprising the following DNA sequences: 

a) a DNA sequence which encodes the GGPP synthase of Flavobacterium sp. R1534 (crtE) or a DNA 
sequence which is substantially homologous; 

b) a DNA sequence which encodes the prephytoene synthase of Flavobacterium sp. R1534 (crtB) or a DNA 
sequence which is substantially homologous; 

c) a DNA sequence which encodes the phytoene desaturase of Flavobacterium sp. R1534 (crtI) or a DNA 
sequence which is substantially homologous; 

d) a DNA sequence which encodes the lycopene cyclase of Flavobacterium sp. R1534 (crtY) or a DNA 
sequence which is substantially homologous; 

e) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283) 
[crtW E396] or a DNA sequence which is substantially homologous; 

or a cell which is transformed by a vector comprising DNA sequences specified above under a) to e) and by 
isolating canthaxanthin from such cells or the culture medium by methods known in the art. 

2. A process for the preparation of a mixture of adonixanthin and astaxanthin or adonixanthin or astaxanthin alone by 
a process as claimed in claim 1 characterized therein that in addition to the DNA sequences specified in claim 1 
under a) to e) the following additional DNA sequence is present: 

f) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283) 
[crtZ E396] or a DNA sequence which is substantially homologous; 

and the DNA sequence specified under e) of claim 1 is as specified in claim 1 or the following sequence: 

g) a DNA sequence which encodes the β-carotene β4-oxygenase of Alcaligenes strain PC-1 (crtW) or a DNA 
sequence which is substantially homologous; 

and isolating the desired mixture of adonixanthin and astaxanthin or adonixanthin or astaxanthin alone from 
such cells or the culture medium and separating the desired mixture or carotenoids alone from other 
carotenoids which might be present by methods known in the art. 

3. A process for the preparation of zeaxanthin by a process as claimed in claim 1 characterized therein that the DNA 
sequence as specified under e) is replaced by the DNA sequence as specified under f) in claim 2 and by isolating 
zeaxanthin from the cell or the culture medium and separating it from other carotenoids which might be present by 
methods known in the art. 
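
The gene combinations recited in claims 1 to 3 can be summarized as a simple lookup. The following Python sketch is illustrative only: the function name `claimed_product` and the set-based representation are assumptions, and the sketch deliberately ignores the source organism of each gene and the "substantially homologous" alternatives.

```python
# Genes common to claims 1 to 3 (hypothetical constant name).
CORE = frozenset({"crtE", "crtB", "crtI", "crtY"})


def claimed_product(genes):
    """Return the carotenoid(s) recited in claims 1 to 3 for a gene set, else None."""
    genes = frozenset(genes)
    if not CORE <= genes:
        return None  # the four common genes are required in every claim
    extra = genes - CORE
    if extra == {"crtW"}:
        return "canthaxanthin"  # claim 1
    if extra == {"crtW", "crtZ"}:
        return "adonixanthin and/or astaxanthin"  # claim 2
    if extra == {"crtZ"}:
        return "zeaxanthin"  # claim 3
    return None


print(claimed_product({"crtE", "crtB", "crtI", "crtY", "crtZ"}))
```

A gene set that matches none of the three claims returns `None` rather than a guessed product.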

4. A process for the production of adonixanthin by culturing under suitable culture conditions a cell which is 
transformed by a DNA sequence comprising the following heterologous DNA sequences: 

a) a DNA sequence which encodes the GGPP synthase of the microorganism E-396 (FERM BP-4283) 
[crtE E396] or a DNA sequence which is substantially homologous; 

b) a DNA sequence which encodes the prephytoene synthase of the microorganism E-396 (FERM BP-4283) 
[crtB E396] or a DNA sequence which is substantially homologous; 

c) a DNA sequence which encodes the phytoene desaturase of the microorganism E-396 (FERM BP-4283) 
[crtI E396] or a DNA sequence which is substantially homologous; 

d) a DNA sequence which encodes the lycopene cyclase of the microorganism E-396 (FERM BP-4283) 
[crtY E396] or a DNA sequence which is substantially homologous; 

e) a DNA sequence which encodes the β-carotene hydroxylase of the microorganism E-396 (FERM BP-4283) 
[crtZ E396] or a DNA sequence which is substantially homologous; and 

f) a DNA sequence which encodes the β-carotene β4-oxygenase of the microorganism E-396 (FERM BP-4283) 
[crtW E396] or a DNA sequence which is substantially homologous; 

and isolating adonixanthin from the cell or the culture medium and separating it from other carotenoids which 
might be present by methods known in the art. 

5. A process for the preparation of a food or feed composition characterized therein that after a process as claimed in 
any one of claims 1 to 4 has been effected the carotenoid or carotenoid mixture is added to food or feed. 

6. A process as claimed in any one of claims 1 to 5 characterized therein that the transformed host cell is a prokaryotic 
host cell, like E. coli, Bacillus or Flavobacter. 

7. A process as claimed in any one of claims 1 to 5 characterized therein that the transformed host cell is a 
eukaryotic host cell, like yeast or a fungal cell. 
Fig. 1 
Fig. 5 
Fig. 8 
Fig. 9 

1 MTDLTATSEA AIAQGSQSFA QAAKLMPPGI REDTVMLYAW CRHADDVIDtj 

51 QVMGSAPEAG GDPQARLGAL RADTLAALHE DGPMSPPFAA LRQVARRHDF 

101 PDLWPMDLIE GFAMDVADRE YRSLDDVLEY SYKVAGWGV MMARVMGVQD 

151 DAVLDRACDL GLAFQLTNIA RDVIDDAAIG RCYLPADWLA EAGATVEGPV 

201 PSDALYSVII RLLDAAEPYY ASARQGLPHL PPRCAWSIAA ALRIYRAIGT 

251 RIRQGGPEAY RQRISTSKAA KIGLLARGGL DAAASRLRGG EISRDGLWTR 

301 PRA 
Fig. 10 

1 MSSAIVIGAG FGGLALAIRL QSAGIATTIV EARDKPGGRA YVWNDQGHVT 

51 DAGPTWTDP DSLRELWALS GQPMERDVTL LPVSPFYRLT WADGRSFEYV 

101 NDDDELIRQV ASFNPADVDG YRRFHDYAZE VYREGYLKLG TTPFLKLGQM 

151 LNAAPALMRL QAYRSVHSMV ARFIQDPHLR QAFSFHTLLV GGNPFSTSSI 

201 YALIHALERR GGVWFAKGGT NQLVAGMVAL FERLGGTLLL NARVTRIDTE 

251 GDRATGVTLL DGRQLRADTV ASNGDVMHSY RDLLGHTRRG RTKAAILNRQ 

301 RWSMSLFVLH FGLSKRPENL AHHSVIFGPR YKGLVNEIFN GPRLPDDFSM 

351 YLHSPCVTDP SLAPEGMSTH YVLAPVPHLG RADVDWEAEA PGYAERIFEE 

401 LERRAIPDLR KHLTVSRIFS PADFSTELSA KHGSAFSVE? ILTQSAWFR? 

451 HNRDRAIPNF YIVGAGTHPG AGIPGWGSA KATAQVMLSD LAVA 
Fig. 11 

1 MSHDLLIAGA GLSGALIALA VRDRRPDARI VMLDARSGPS DQHTWSCHDT 

51 DLSPEWLARL SPIRRGEWTD QEVAFPDHSR RLTTGYGSIE AGALIGLLQG 

101 VDLRWNTHVA TLDDTGATLT DGSRIEAACV IDARGAVETP HLTVGFQKFV 

151 GVEIETDAPH GVERPMIMDA TVPQMDGYRF IYLLPFSPTR ILIEDTRYSD 

201 GGDLDDGALA QASLDYAARR GWTGQEMRRE RGILPIALAH DAIGFWRDHA 

251 QGAVPVGLGA GLFHPVTGYS LPYAAQVADA IAARDLTTAS ARRAVRGWAI 

301 DRADRDRFLR LLNRMLFRGC PPDRRYRLLQ RFYRLPQPLI ERFYAGRLTL 

351 ADRLRIVTGR PPIPLSQAVR CLPERPLLQE RA 
Fig. 12 

1 MSTWAAILTV ILTVAAMELT AYSVHRWIMH GPLGWGWHKS HHDEDHDHAL 

51 EKNDLYGVIF AVISIVLFAI GAMGSDLAWW LAVGVTCYGL IYYFLHDGLV 

101 HGRWPFRYVP KRGYLRRVYQ AHRMHHAVHG RENCVSFGFI WAPSVDSLKA 

151 ELKRSGALLK DREGADRNT 
crtE 

#100: 5 ■ raracarr ?.rr ^7gaacacaaa) c;ac2-AIGACGCCCAAGCAGCAGCAArTC 3' 
SpeI RBS NdeI 



#101: 5 ' TArATAC££fisS£jTCAGCCGCGACGGCCTGTGG 3 ' 
SmaI 



#104: s ■ rar.ar ff-a^rrr £ ja^agcaqaaat :cacaLLIgAGCACrTGGGCCGCAATCC 3' 
EcoRI RBS NdeI 



#105: 5' GTTTCAGCTCTGCCTTGAGGC 3' 



crtZ — p- crtY 

MVT1: 5 1 GCGAAGGGGCGGATCGCAATAC gZfi faaagqaagac jJcgra ATGAGCCATGATCTGCTGATCG 3 ' 

PmlI 



MUT1: 5 ' SCCCCCTgCTGCAGCASAGAGCtl feaagcaqqcai agagAIgAGTTCCGCCATCGTCATCG 3 ' 

MunI 



crtl *j | » crtB 

MUT3: 5 1 GGTCATGCTGTCGGACCTGGCCGTCGCtZ fla,aagqaggar e|e a cATG AC CG ATC TG ACGGC GAC TTCC 3 ' 

BvnHl 



MUT5: 5 ' ATATATcrca 3fcXHCC>:cc"~3c aaGCTCTCTCCTGCAGCAGGG 3 
M<mJ U— ertY 



MUT6: 5 • azqaztflaai£C-cc«ixaaGCGACGGCCAGGTCCGACAGC 3 ' 
BvbKI L« — crtX 



CAR17: 5' CAGAACCCATCACCTGCCCGTC 3' 



a! 3 : 5 ' CGCfiAillCTCGCCGGCAATAGTTACC 3 ' 
EcoRI 



at«: 5 ' GTCACATGCATG£i2CJ;TAC£A£i^^ATAAGCATGTG^£CjlC?TCAACTAACGGGGCAGG 3 ' 
Sphi Sacl /uiQ 
Fig. 20/2 
Fig. 24A 



CTAAAT?GTAAGCGT?XArArr7TGrT.\AAATrCSCG?TAAArrrrTGT?AAATCAGCTC 

! -f * -r + 60 

ga?ttaacat?cgcaa-~a?aaaacaa?t??aagcgcaatt7aa^ 

Arrrrr:AACCAA?AGGCCGA^.TCGGC\AAA?CCCr?ATAAATCAAAAGAATAGACCGA 

S1 -r * -r e 120 

TAAAAAATTGGTTA?CCGGCTrrAGCCGTTTTAGCGAA?ATTrAGTT7TCTrATCTSGC7 

GATAGGGTTGAGTGTTGiTCCAGTTT'GGAACAAGAGTCCACTArrAAAGAACGTGGACTC 

121 • + + 180 

CTATCCCAACTCACAACAAGGTCAAACCTTGTTCTCAGGTGATAATTTCTTGCACCTGAG 



131 h ► 1 ► 240 

GTrGCAGTTTCCCGCTTTTTGGCAGATAGTCCCGCTACCGGGTGATGCACTTGGTAGTGG 

CTAATCAA O TTT T TTGGGGTCGAGGTGCCGTAAAGCACTAAATCGGAACGCTAAAGGGAG 

241 ► * + ► •■ ► 300 

GATTAGTTCAAAAAACCCCAGCTCCACGGCATrrCGTGATTTAGCCTTGGGATTTCCCTC 

CCCCCGATTTAGAGCrTGACGGGGAAAGCCGGCGAACGTGGCGAGAAAGGAAGGCAAGAA 

GG<WGCrAAA7CTCGAACTGCCCCTrTCGGCCGCTTGCACCGCTCTTTCCTtCCCTTCTT 

AGCGAAAGGAGCGGGCGCTAGGGCGCTGGCAAGTGTAGCGGTCACGCTGCGCGTAACCAC 

351 * + * h < + 420 

TCGCTTrCCTCGCCCGCGATCCCGCGACCGTTCACATCGCCAGTGCGACGCGCATTGGTG 

CACACCCGCCGCGCTTAATCCGCCGCTACAGGGCCCGTCCCATTCGCCATTCAGGCTGCG 

421 * * 4 f 480 

GTGTGGGCGGCGCGAATTACGCGCCGA7GTCCCGCGCAGGGTAAGCGGTAAGTCCGIACGC 

CAACTGrrGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGG 

481 ► ^ . i ► S40 

GTTGACAACCCrrCCCGCTAGCCACCCCCGGAGAAGCGATAArGCGGTCGACCGCTTTCC 

GGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTG 

541 ♦ h ~-i + 600 

CCCTACACGACGTTCGGCTAATTCAACCCATTGCGGTCCCAAAAGGGTCAGTGCTGCAAC 

TAAAACGACGGCCAGTGAGCGCGCGTAATACGACTCACTATAGGGCGAArTGGAGCTCCA 

ATTTTGCTGCCGGTCACTCGCGCGCATTATGCTGAGTGATAiCCCGCTTAACCTCGAGGT 

CCGCGGTCGCGGCCGCTCTAGTGGA TCCGCGCCTGCCCCTT CGCGATCAGCAGCCGCCCT 

5 51 + + r ► 720 

GGCGCCACCGCCGGCGAGATCACCTAGGCGCGGACCGGCAAGCGCTAGTCGTCGGCGGGA 

TGCGGATCGGTCAGCATCATCCCCATGAACCGCAGCGCACGACGCAGCGCGCGCCCCAGA 

721 r * 780 

ACGCCTAGCCAGTC3TA3TAGGGGTACTTGGCGTCGCGTGCTGCGTCGCGCGCGGGGTCT 

TCGGGCGCGTCCACCACGCCAI'GCGCCATCATCGCGAAGGCCCCCGGCGCCATGGGGCGC 

781 * + 840 

AGCCCGCGCAGGTCGTGCCCrACGCGG-ACTAGCGCTTCCGGGGGCCGCCGTACCCCGCG 

GTGCCCATTCCGAAGAACTCGCAGCCTGTCCGCTCCGCAAGGTCGCGCCAGATCGCGCCG 

3„1 t * * * 900 

CACGGGTAAGGCTTCrrGAGCGTCGGACACGCGACGCGTTCCAGCGCGGTCTAGCGCGGC 

TATTCCGATGCACTGACGCGCCCGATCCCCGTGGCCCCGCCCTGCCCCCCCGCCACCAGC 

901 — * 950 

A7AAGGC?ACGTCACTGCCCGGGCTACGCGCACCCGGGCGGGACGGGGCGGCGGTGGTCG 
(XATCCCGCACGAACCCTTCCGACATGATGTGC-GATCCATGGCCCGTCArrGCAAAACC 
. + * ^ + + . r 102Q 

CGTAGCGCGTGCTTGGGAAGGC7CTACTACACGACTAGGTACCGGGCAGTAAC3TTTTGG 
GATCACCGATCCTGTCGCGTGATGGCATTGTTTGCAArGCCCCGAGGGCTAGCATGCCGC 



CCGCATCGGGTCTGGGGGCGGCCTCGGCGCGGATGCTGGCCCAAGGC5GCGCGAAGGTCG 

1141 ' + + + ► 1 (. 1,200 

GGCGTAGCCCAGACCCCCGCCGGAGCCGCGCCTACGACCC»?rrCCGCCC^CTTCCAGC 

TGCTGGCCGATCTGGCCGAACCGAAGGACGCGCCCGAACGCGCSGTTCACGCGGCCTGCG 

1201 + * ► + 1 i. 12S0 

ACGACCGGCTAGACCGCC 



TGCACTGGCTGCGCTGGCGACGCG'tCTGCCGGTAGCGCGACCGCTGGCTGGCGAAGCCGT 

GGCTGGACGCCCTTGTGAACTGCGCGGGCATCGCGCCGGCCGAACGGATGCTGGGCCGCG 

1321 ► ► ► «. 1 > 1330 

CCGACCTGCCCWAAC^CrrGACGCGCCCCTAC^CG^ 

A^GGGCCGCATGGACTGGACAGCTTTGCCCGTGCGGTCACGATCAACCTGATCGGCAGCT 

1381 — , > ► , k 1440 

TGCCCGGCGTACCTGACCrGTCGAAACGCGCACGCCAGTGCTAGTTGGACTAGCCGTCGA 

TCAACATGGCCCGCCTTGCAGCCGAGGCGATGGCCCGGAACGAGCCCGTCCGGGGCGAGC 
1441 1 - ► , f ► 1500 



rTAGCCTCTCCAGC 

CCTATGCGGCCAGCAAGGCGGGCGTGGCGGGCATGACGCTGCCGATGGCCCGCGACCTTG 

1S«1 , y «. ; , , y 1S2Q 

GGAXACGCCGGTCGTTCCGCCCGCACCGCCCGTACTOCGACGGCTACCGGGCGCTGCAAC 

CGCO^CGGCATCCCKGTCaTGACCATCGCGCCCGGCATCTTCCa»CC(XG^ 

1S21 s. , „ „ 1680 

GCGCCGTGCCGTAGGCGCAGTACTGGTAGCGCGGGCCGTAGAAGGCGTGGGGCTACGACC 



TCCCCGACGGCGTCCTGCAAGTCCTGTCGGACCCGCGCCGCCACGGGAAGGGGAGCGCCG 



ACCCTCTCGGCAGCCTTATGCGCCGCGACAACGTGCTGTAGTAGCGCTTGGGGTACGACT 

ACGGAGAGGTCATCCCCCTCGACGGCGCATTCCGCATGGCCCCCAAGTGAAGGAGCGTTT 

1301 + :_- , + r 13S0 

TGCCTCTCCACTAGGCGGAGCTGCCGCGTAACGCGTACCGGGGGTTCACTTCCTCGCAAA 

CATGGACCCCATCGTCATCACCGGCCCGATGCGCACCCCGATGGGGGCAtTCCAGGGCGA 

1361 t * «. «. + 1920 

GTACCTGGCCTAGCAGTAGTGGCCGCGCTACGCCTGCGGCTACCCCCGTAACGTCCCCCT 

rCTTGCCGCGATGGATGCCCCGACCCTTGGCGCGGACGCGArCCGCGCCGCGCTGAACGG 

1921 — * + 1930 

AGAACGGCGCTACCTACCCGCCTGCCAACCGCGCCTGC3C7AGCCGCGGCGCGACTTGCC 
CCTGTCGCCCGACATGGTGGACGAGCrGCTGATGGGCTGCGTCCTCGCCGCGGGCCAGCC 
U81 — * + «. „ , 204Q 

GGACAGCGGGCTGTACCACCTGCTCCACGAC7ACCCGACGCAGGAGCGGCGCCCGGTCCC 

TCAGCCACCGCCACGTCAGGCGGCGCTTGGCGCCGGACTGCCGCTGTCGACGGGCACGAC 
2041 * * + + . N 2L0Q 

AGTCCGTGGCCGTGCAGTCCGCCGCGAACCGCSGCCTGACGGCGACAGCTGCCCGTGCTG 

CACCATCAACGAGATGTGCGGATCGGGCATGAAGGCCGCGATGCTGGGCCATGACCTGAX 
2 10l , „ + + ^ 

G7GGTAGT7GCTCTACACGCCTAGCCCGTACTTCCGGCGCTACGACCCGGTACTGGACTA 

CGCCGCGGGATCGGCGGGCATCGTCGTCGCCGGCGGGATGGACAGCATGTCGAACGCCCC 
2lSl , + + + ( h 222Q 

GCGGCGCCCTAGCCGCCCGTAGCAGCAGCGGCCGCCCTACCTCTCGTACAGCTTGCGGGG 

CTACCTGCTGCCCAAGGCGCGGTCGGGGATGCGCATGGGCCATGACCGTGTGCTGGATCA. 
2221 k * + , , , 22ao 

GATGGACGACGGGTTCCGCGCCAGCCGCTACGCGTACCCGGTACTGGCACACGACCTAGT 

CATGTTCCTCCACGGGTTGGAGGACGCCTATGACAAGGvJCCGCCTGATGGGCACCrrCGC 
22 ai K ♦ + „ h , 2340 

GTACAAGGAGCTGCCCAACCTCCTGCGGATACTGTTCCCGGCGGACTACCCGTGGAAGCG 

CGAGCATTGCGCCGGCGATCACGGTTTCACCCGCCAGGCGCAGGACGACTATGCGCTGAC 

2341 + + ► + , 2400 

GCrCCTAACGCGGCCGCTAGTGCCAAAGTGGGCGCTCCGCGTCCTGCTGATACGCGACTG 

CAGCCTGGCCCGCGCGCAGGACGCCATCGCCAGCGGTGCGTTCGCCGCCGAGATCGCGCC 
2401 h + + , „ 24S0 

GTCGGACCGGGCGCGCGTCCTGCGGTAGCGGTCCCCACGGAAGCCGCGGCTCTAGCGCGG 

CGTGACCGTCACGGCACGCAAGGTGCAGACCACwGTCGAXACCGACGAGATGCCCGGCAA 
24 SI y p + h , , 2S20 

GCACTGGCAGTGCCGTGCGTTCCACGTCTGGTGGCAGCTATGGCTGCTCTACGGGCCGTr 

GGCCCGCCCCGAGAAGATCCCCCATCTGAAGCCCGCCTTCCGTGACGGTGGCACGGTCAC 
2521 k + , , , h 2530 

CCGGCCGGGGCTCTTCTAGGCGGTAGACTTCGGGCGGAAGGCACTGCCACCGTGCCAGTG 

GGCGGCGAACAGCTCGTCGATCTCGGACGGGGCGGCGGCGCTGGTGATGATGCGCCAGTC 
2581 + + * „ + > 2S40 

CCGCCGCTTCTCGAGCAGCTAGAGCCTGCCCCGCCGCCGCGACCACTACTACGCGSTCAG 

GCAGGCCGAGAACCTGGGCCTGACGCCGATCGCGCGGATCATCGGTCATGCGACCCATGC 
2S41 * ♦ + + + , 2700 

CGTCCGGCTCTTCGACCCGGACTGCGGCTAGCGCGCCTAGTAGCCAGTACGCTGGGTACG 

CGACCGTCCCGGCCTCTTCCCGACGCCCCCCATCGGCGCCATGCGCAAGCTGC7GGACCG 

2701 + * + <. , + 2760 

GCTCGCACGGCCGGACAAGCGCTGCCGGGGGTAGCCGCGCTACGCGTTCGACGACCTGGC 

CACCGACACCCGCCTTGGCGATTACGACCTC7TCGACGTGAACGAGGCATTCGCCGTCGT 

27S1 A + + 2820 

GTGCCTCTGGGCGGAACCGCTAATCCTGGACAAGCrCCACTTGCTCCGTAAGCGGCAGCA 

CGCCATGATCGCGATGAACGACCTTGGCCTGCCACACGATGCCACGAACATCAACGGCGC 
2321 * ; * . + + + 2880 

GCGC7ACTAGCGCTACTTCCTCGAACCGGACGGTGTGCTACGGTGCTTGTAGTTGCCGCC 

GGCCTCCCCGCTTGGGCATCCCATCCGCGCGTCGGGGCCCCGGATCATCGTCACGCTGCT 
2881 * * . + + 2940 

ccggacccgccaacccctagcct:agccgcgcaccccccgcgcctagtaccagtgcgacga 
gaacgcgatggcggcgccgggcgcgacccccgcggccgcatccgtctgcatcggccgggg 

294! - * + T + 3000 

CTTGCGCTACCGCCGCGCCCC3CGCTGCCCGCCCCCCCGTAGGCACACGTAGCCGCCCCC 
WAGGCGACGGCCATCGCGCTGGAACGGeTGAGCTAATTCATTTGCGCGAATCCCCGTTT 



GCTCCGCTGCCGGTAGCaCGACCTTGCCGACrCGArrAAGTAAACGCGCTTAGGCGCAAA 
TTCGTGCACGA7GGGGGAACCGGAAACGGCCACGCCTGTTG7GGTTGCGTCGACCTGTC7 



AAGCACGTGCTACCCCCTTGGCCtTTGCCGGTGCGGACAACACCAACGCAGCTGGACAGA 
TCGGGCCATGCCCOTGACGCGATGTGGCACCCGCATGGGGCGTTGCCGATCCGGTCGCAT 



CrGACTGCGTTGCTTCCGTGGCTACTGCaGGTTCGTCGTTAAGGGGGATGCGCTAGACCA 

CGAGATCAGGCTGGCGCAGATCTCGGGCCAGTTCGGCGTGGTCTCGGCCCCCCTCGGCGC 
3241 v ♦ + + „ „ 330Q 

GCTCTAGTCCGACCGCGTCTAGAGCCCGGTCAAGCCGCACCAGAGCCGGGGCGAGCCGCG 

GGCCATGAGCGATGCCGCCCTGTCCCCCGGCAAACGC7TTCGCGCCGTGCTGATGCTGAT 
3301 + * + ^ + „ 3360 

CCGGTACTCGCT ACGGCGGG AC AGGGGGCC G7T7GCGAAAGCGCCGCACGACTACGACTA 

GCTCGCCGAAAGCTCGGGCGGGGTCTGCCATGCGATGGTCGATGCCGCCTGCGCGGTCGA 
335! * + + + + 342Q 

CCAGCGGCTTTCGAGCCCGCCCCAGACGCTACGCTACCAGCTACGGCGGACGCGCCAGCT 

GATGGTCCATGCCGCATCGCTGATCrTCGACGACATGCCCTGCATGGACGATGCCAGGAC 

3421 + * 1 , k 34-ao 

CTACCAGGTACGGCGTAGCGACTAGAAGCTCCTGTACGGGACGTACCTGCTACGGTCCTG 

CCGTCGCGGTCAGCCCGCCACCCATGTCGCCCATGGCGAGGGGCGCGCGGTGCrTGCGGG 

3431 + + ♦ ► , h 3540 

GGCAGCCCCAG7CGGGCGG7GGGTACAGCGCGTACCGCTCCCCGCGCGCCACGAACGCCC 

CATCGCCCTGA7CACCGAGGCCA7GCGGATT77GGGCGAGGCGCGCGGCGCGACGCCGGA 

3S41 ♦ * +. + f ¥ 3S00 

GTAGCGGGAC7AG7GGCTCCGG7ACGCC7AAAACCCGC7CCGCGCGCCGCGCTGCGGCCT 

TCAGCGCGCAACGCTGGTCGCATCCATCTCGCGCGCGATGGGACCGGTGGCGCTGTGCGC 
3501 + ► + , 3560 

AGTCGCGCGTTCCGACCAGCGTAGGTACAGCGCGCGCTACCCTGGCCACCCCGACACGCG 

AGG&CAGGATCTGGACCXGOOXXnaUAGGACGCCr^^ 
3S61 * * «. + 4 + 372Q 

TCCCGTCCTAGACCTGGACGTGCGGGGGTTCCTGCGGCGGCCCTAGCTTGCACTrGTCCT 
CCTCAAGACCGGC3TGCTGTTCG7CGCGGGCCTCGAGATGCTGTCCATTATTAAGGGTCT 



GGAGTTCTGGCCGCACGACAAGCAGCGCCCGCAGC7CTACGACAGGTAATAATTCCCAGA 

GGACAAGCCGGAGACCGACCAGC7CA7GGCC7TCCGGCGTCAGCTTGGTCGGGTCTTCCA 

3781 * ♦ . + _+ 3840 

CCTGTTCCGGCTGTGGCTCGTCGAGTACCGGAAGCCCGCAGTCGAACCAGCCCAGAAGGT 

G7CC7ATGACCACCTGCTGGACG7GATCGGCGACAAGGCCAGCACCCGCAAGGATACGGC 

3841 + , + + + 3900 

CAGGATACTGCTGGACGACCTGCACTACCCCCTGTTCCGGTCGTGGCCGTTCCTATGCCG 

GCGCGACACCGCCGCCCCCGGCCCAAACGGCCGCCTCATGGCGGTCGGACAGATGCGCGA 

3901 * * + + + 39so 

CGCGCTC7GCCGGCGGGGGCCCGGTTTCCCCCCGGAC7ACCCCCAGCC7GTCTACCCCCT 

CGTGGCCCAGCArrACCGCGCCAGCCGCGCGCAACTGCACGAGCTGATGCGCACCCGGCT 
3951 * — 4, + 4Q20 

GCACCGCGTCGTAATGCCGCGG7CGGCGCGCG77GACC7GCTCGACTACGCGTGCGCCGA 
GTTCCGCGGGGCGCAGATCCCC-GACCTGCTCGCCCGCCTCCTGCCGCATGACATCCCCCG 
4021 * + + 4Q80 

CAAGGCGCCCCCCGTCTAGCGCCTGGACGAC^GGGCGCACGACGGCGTACTGTAGGCGGC 

CAGCGCCTAGGCGCGCGGTCGGGTCCACACGCCGTCGCGGCTGAT7TCGCCGCCGCGCAG 
4081 , _ ^ + 414(J 

GTCGCGGATCCGCCCGCCACCCCAGGTGTCCGGCAGCGCCGACTAAAGCGGCGGCGCGTC 

GCGCGATGCGGCCGCGTCCAAGCCTCCGCGCGCCAGAAGCCCGATCTTGGCAGCCTTCGA 
4141 + * + + ^ ^ 42(J0 

CGCGCTACGCCGGCGCAGG-TCGOAGGCGCCCGGTCTTCGGGCTAGAACCGTCGGAAGCr 

CGTGCTGATCCGCTGGCGATAGGCCTCGGGGCCACCCTGCCGGATGCGCGTCCCGATTGC 
4201 . * + . h h 42SQ 

GCACGAC7AGGCGACCGCTATCCGGAGCCCCGGTGGGACGGCCTACGCGCAGGGCTAACG 

GCGATAGATACGCAGCGCGGCGGCGATCGACCACGCGCAGCGCGGCGGCAGATGCGGAAG 
42S1 c + * + , 432Q 

CGCTATCTAXGCGTCGCGCCGCCGCTAGCTGGTGCGCGTCGCGCCGCCGTCTACGCCTTC 

CCCCTGCCGCGCCGAGGCATAATAGGGCTCGGCCGCGTCAAGCAGGCGGArGATGACGGA 
4321 * + * „ h „ 43aQ 

GGGGACGGCGCGGCTCCCTATrATCCCGAGCCCGCGCAGTTCGTCCGCCTACTACTGCCT 

ATAGAGCGCGTCCGAAGGCACCCGACCCTCAACCGTCGCCCCCGCCTCGGCCAGCCAGTC 

TATCTCGCGCAGGCTTCCGTGGCCTGGGAGTTGGCAGCGGGGGCGGACCCGGTCGGTCAG 

GGCAGGCAGATAGCAGCGCCCGATGGCGGCATCGTCGATCACGTCGCGAGCGATGTTCGT 
, + + + + + + 45<)0 

CCGTCCGTCTATCGTCGCGGGCTACCGCCGTAGCAGCTAGTGCAGCGCTCGCTACAAGCA 



4381 



4440 



4501 • 



45S1 



4621 



4S31 



CAGCTGGAACGCAAGGCCCAGATCGCAGGCGCGATCCACCACCGCATCGTCCTGCACGCC 
GTCGACCTTGCGTTCCGGGTCTAGCGTCCGCGCTAGGTCGTGGCGTAGCAGGACGTGCGG 
CATCACCCGCGCCATCATCACGCCCACGACCCCCGCGACGTGOTAGGAATATTCCAGCAC 
GTAGTGGGCGCGGTAGTAGTGCGGGTGCTGGGGGCGCTGCACCATCCTTATAAGGTCGTG 
GTCATCCAGGCTGCGGTATTCGCGATCCGCGACATCCATCGCGAAACCCTCGATCAGGTC 
CAGTAGGTCCGACCCCATAAGCGCTAGGCGCTGTAGGTAGCGCTTTGGGAGCTAGTCCAG 
CATCGGCCAAAGGTCCGGGAAATCATGCCGCCGGGCGACCTGGCGCAGCGCCGCGAAGGG 



' 4560 



• 4520 



► 4S80 



GTAGCCGGTTTCCAGGCCC" 



i- 4740 



CGGCGACATCGGGCCGTCCTCGTGCAGCGCGGCCAGCGTGTCGGCGCGCAGCGCCCCCAG 
4741 * * + , <. „ 4300 

GCCGCTGTAGCCCGGCACGAGCACCTCGCGCCGGTCGCACAGCCGCGCGTCGCGGGGGTC 

CCGCGCCTCTCGCTCGCCGCCCGCCTCGGGGCCAGAACCCATCACCTGCCCGTCGATCAC 
4801 4 ■+ + 43S0 

GGCGCGGACACCCAGCGCCGGGCGGAGCCCCCGTCTTGGCTAGTGGACGGGCAGCTAGTG 

GTCATCCGCATCCCTGCACCACGCATAGAGCATGACCGTATCCTCGCGGATGCCGGGCGG 
48S1 * r + 4920 

CAGTAGGCGTACGGACGTGCTCCGTATCTCGTACTGGCATAGGAGCGCCTACCGCCCGCC 

CATCAGCTTGGCCGCCTGCGCGAAGCTTTGCGAACCCTGCGCGA7GGCCGCTTCGGAAGT 
4921 ^ _^ + 498Q 

GTAGTCCAACCGCCCGACGCGCTTCGAAACGCTTGGCACGCGCTACCGGCGAAGCCTTCA 

CGCCGTCAGATCCGTCATGCGACGGCCAGCTCCGACACCATGACCTGCGCCGTGGCCTTG 
4981 - , ^ 504Q 

GCGGCAGTC7AGCCAGTACGCTGCCCGTCCAGGCTCTCGTACTGGACGCCGCACCCGAAC 
GCGCTGCCAACGACACCCCGGATGCCCGCACCCGGATGCGTGCCCGCCCCCACGATGTAG 

5041 + * + 3100 

CGCGACGGTTGCTGTGGGCCCTACGGGCGTGGGCCTACGCACGGGCGGGGGTGCTACArC 

AAGTTCGGGATCGCGCGGTC3CGGTTATGCGGGCGCAACCAGGCGGATTGCGTCACGATC 

S101 + ^ + + * k SISO 

TTCAAGCCCTAGCGCGCCAGCGCCAATACGCCCGCCTTGGTCCGCCTAACGCAGTCCTAG 

GGCTC5ACCGAGAAGGCGCTGCCGTGATGGGCCGACAGTTCGGTGCTGAAAXCGGCGGGG 

51 51 * + * 1. 5220 

CCGAGCTGGCTCTTCCGCGnCGGCACTACCCGGCTGTCAAGCCACGACTTTAGCCGCCCC 

CTGAAGAffGCGGCTGACGGTCAGGTGCTTGCGCAGGTCGGGGATGGCGCGGCGCTCCAGT 

S221 + + + 1. u 5230 

GACTTCTACGCCGACrGCCAGTCCACGAACGCGTCCAGCCCCTACCGCGCCGCGAGGTCA 

TCCTCGAAGATGCGCTCGGCATAGCCCCGGGCCTCGGCTTCCCAATCGACATCGGCGCGG - 

5281 * + + + „ 5340 

AGGAGCrrCTACGCGAGCCGTATCGGGCCCCGGAGCCGAAGGGTTAGCTCTAGCCGCGCC 

CCCAGATGCGGAACGGGCGCAAGGACGTAATGCGTGGACATCCCCTCGGGGGCC&GGCTG 

3341 + + + k * 5400 

GGGTCTACGCCTTGCCCCCGTTCCTGCATTACGCACCTGTAGGGGAGCCCCCGGTCCGAC 

GGATCGGTCACGCAGGGCGAATGCAGATACATCGAGAAATCGTCCGGCAGGCGTGGCCCG 

5401 * + * + c c 54S0 

CCTAGCCAGTGCGTCCCGCTTACGTCTATGTAGCTCTTTAGCAGGCCGTCCGCACCGGCC 

54 51. + -+I c SS20 

AACTTCTAGAGCAAGTGGTCGCGGAACATCGCGCCCGGCTTCTACTCCGACACCACCCGG 

AGGTTCTCGGGGCGCTTGGACAGGCCGAAATGCAGCACGAAGAGCGACATCGACCAGCGC 

5521 * + » ► 1. 5580 

TCCAAGAGCCCCGCCAACCTGTCCGGC-TTACGTCGTCCTTGTCGCTGTAGCTGGTCGCG 

TGCCGGTTCAGGATCGCGGCCTTGGTGCGCCCGCGGCGGGTATGGCCCAGCAGGTCGCGA 

5 SSL + + 1- + k f 5S40 

ACGGCCAAGTCCTAGCGCCGGAACCACGCGGGCGCCGCCCATACCGGGTCGTCCAGCGCT 

TAGCTGTGCATCACGTCGCCGTTCCTGGCCACCGTATCCGCGCGCAACTGCCGCCCGTCC 

ATCGACACGTAGTGCAGCGGCAACGACCGGTGGCATAGGCGCGCGTTGACGGCGGGCAGG 

AGCAGCGTGACGCCCGTGGCGCGATCGCCCTCGGTGTCGATCCCCGTGACGCGGGCATTC 

5701 + + + + c 5750 

TCGTCGCACTGCGGGCACCGCGCTAGCGGGAGCCACAGCTAGGCGCACTGCGCCCGTAAG 

AGCAGCAGCGTGCCGCCAAGACGCTCGAACAGGCCGACCATGCCCGCGACCAGCTGGTTG 

3751 * * + «. + ¥ 5820 

TCGTCGTCGCACGGCGGTTCTGCGACCTTGTCCCGCTGGTACGGGCGCTGGTCGACCAAC 

GTGCCGCCCTTGGCGAACCAGACGCCGCCGCGCCGTTCCAGCGCATGGATCAGCGCATAG 

5821 * + + - : r 5380 

CACGGCGGGAACCGCTTGG7C70CGCCGCCGCGGCAAGGTCGCGTACCTACTCGCGTATC 

ATCGAGCTGCTCGAAAACGGGTTCCCGCCGACCAGCACCCTGTGCAACGACAAGGCCTGC 

5881 , * + 5940 

TAGCTCGACCAGCTTTTGCCCAAGGGCGGCTGGTCGTCGCACACCTTGCTCTTCCGGACG 

CGCAGATGCGGGTCCTGGA-GAAGCtGCCCCACCATGCTGTGGACCGAGCGGTATGCCTGC 

5941 + — r +. ♦ 5000 

GCGTCTACGCCCAGGACCTAC-rtCGCGCGGTGGTACGACACCTGGCTCGCCATACGGACG 

AGGCGCATCAGCGCCCGCGCGCCCTTCACCATCTGCCCCAGCTTCAGGAAGCGCGTGGTC . 

S001 * „ + S050 

TCCCCGTAGTCGCGCCGCC.-:CC3CAAC?CG?AGACCGCGTCGAAGTCCTTCCCGCACCAG 
CCCAGC77CAGA7ACCCC7GGCGATACACCTCCTCGCCC7AA7CGTGGAAGCGGCGA7AG 
5061 + _ S120 

CWGTCGAAC7C?ArGCCGAC~CC7ATC7CGAGCAGCCCCAT7AGCACC77CGCCGC7ATC 

CCATCGACATCGGC GGGA~7GAAGCAGCC3ACC7CGCGGA7CAGCTCG7CG7CG7CGT7C 

5X21 + S130 

GG7AGC7G7AGCCGCCC7AAC77CC7CC0C7GGACCGCCTAG7CGAGCAGCAGCAGCAAG 

ACGTATTCGAAGC7GCGGCCG7CCGCCCA7G7CAGCCGGTAGAAGGGCGAGACCGGCAGC 

S13 i . , r 6240 

7GCA7AAGC77CGACGCCGGCAGGCGGG7ACAG7CGGCCA7C7TCCCGCrCTGGCCG7CG 

AGCG7CACG7CACGC7CGA7CGG77GGCCGC7GAGGGCCCACAGC7C7CGCAGGC7G7CG 
S24 1 * , + S300 

7CGCAG7GCAGTGCGAGCTACCGAACCGGCGAC7CCCGGGTGTCGAGAGCGTCCCACAGC 

CCCAGCCAG7GC7GGCAGCCCGGACC7ACC77C7GCACCGGGACTAGCAAGGTCTGTATC 

GCGCGGCCGCCGGGCT7G7CGCGGGCC7CGACGATGG7GGTCGCGATGCCGGCCGATTGC 

63S1 * + * g 42 0 

CGCGCCGGCGGCCCGAACAGCGCCCGGAGC7CC7ACCACCAGCGCTACGGCCGGCTAACG 

AGGCGGA7CGCAAGCGCAAGCCCGCCGAAACC7GCGCCGATGACGATGGCGGAACTCATG 

6421 + 1. * S430 

TCCGCCTACCGTTCGCGTTCGGGCGGCTTTGGACGCGCCTACTGCTACCGCCTTGAGTAC 

CTCTCTCCTGCAGCACGGGGCGTTCSGGCAGGCAGCGCACGGCCTGCGACAGCGGAATGG 

5481 f ~ * * i. S54Q 

GAGAGACGACGTCG7CCCCCCCAAGCCGGTCCGTCGCG-GCCGGACGCTGTCGCCTTACC 

GCGGGCC7CCCG7GACGATGCGAAGCCGG7CCGCCAA7CTCAGGCGCCCGGCATAGAAGC 

6541 * . + + S600 

CGCCCGGAGGCCACTGC7AC0C77CGGCCAGCC5G77ACAG7CCGCGGGCCGTATCTTCG 

GCTCGArCAGCGGCTGCGGCAGGCGGTAGAACCGCTGCAGCAGGCGATAGCGACGGTCSG 

6501 * u * ssso 

CCAGCTACTCCCCCACGCCGTCCGCCA7C77GGCCACG7CG7CCGC7ATCGCTGCCAGCC 

GCGGGCAGCCGCGCAACAGCA7CCGG77CAGCAGCCGCAGGAAGCGGTCCCCATCCGCGC 

6661 * * + * 5720 

CGCCCG7CGCCGCC77C7CG7AGCCCAAC7CG7GGGCG7CC77CGCCAGCGCTAGGCGCG 

GATCGA7GGCCCAGCCGCGCACCGCGCGACGCGCCGACGGGG7CG7CAGGTCGCCCGCCC 

6721 * + ~ 5730 

CTAGCTACCGGGTCGGCGCGTGGCGCGCtGCCCGCGTGCGCCAGCAGTCCAGCGCGCGGC 

CGATGGCArCCGCGACCTGCGCGGCArAGGGGAGCGAATATCCGGTGACGGGGTGGAACA 

S78L » * 6340 

GCTACCG7AGGCGC7GGACGCGCCG7A7CGCG7CGC77A7AGGCCACTGCCCCACCTTG7 

GCCC7GCCCCCAGCCCAACCGGCACC GCGCCC7GCGCG7GG7CGCGCCAGAAGCCTA7GG 

534 1 ^ * 5900 

CGGGACGGGGGTCGGGTTGGCGGTGGCGGGGGACGCGCACCAGCGCGGTCTTGGCATACC 

CG7CA7GGGCCAGCGCGA7GCGCAGGA7GCCCC777CGCGCCCCA7C7CC7GCCCCG7CC 

6901 , 5950 

GCAG7ACCCCG7C0CCC7AC:cC7CC7ACGGGGAAAGCGCGCCG7AGAGGACGCGCCAGO 

AGCCCCGCC7GGCGGCA7AG7CCAGCGACGCC7GCGGCACCCCGCCATCGTCCAGATCGC 

S9«l - 7020 

TCCGCGCCCACCGCCC7A7CAGC7CGC7GCGCACCCGG7CGCGCGG7ACCACC7C7ACCG 
CGCCG7CGCTGTACCGCG7ATCC7CGA7CACCA7GCGGG7CGGAC7GAACGGCAGCAGAT 

7021 * T <. 7030 

GCGGCAGCGACArCCCGCATACGAGC7AC7C:rTACGCCCACCC7GAC77CCCG7CG7C7A 

AGATGAAGCGC7ACCCG7CCA7C7GCGGAACGGTCKG7CCATGATCATCGGGCGCTCGA 

7081 - + + 7140 

TCTACT7CGCCA7GGGCAGG7AGACGC=77GCCAGC3CAGG7ACTAG?AGCCCGCGAGCT 

CGCCATGGGGGGCGTCGGTCTCGATCTC-ACCCCCACGAATTTCtGGAAACCCACGGtCA 

GCGGTACCCCCCGCAGCCAGAGCTAGAGCTGCGGGTGCTTAAAGACCTTTGGGTGCCAG? 

GGTGCCGGG7C7CGACGGCACCACGGGCS7CGA7CACGCAGGCAGCCTCCATCCGCGAGC 

7201 - - ■, + 72so 

CCACGCCCCAGAGC7CCCG7GG7GCCCGCAGC7AG7GCG7CCG7CGGAGCTAGGCGC7CG 

CG7CCG7CAGCG7CGCGCCCG7A7C37CCAGCG7CGCGACA7GCGTAT7CCACCGCAGA7 

7251 ■> . h ► 7320 

GCACGCAG7CGCAGCGCGGCCA7AGCAGG7CGCAGCGC7G7ACGCATAAGCTGCCG7C7A 

CCACACCCTGCAGCAGCCCCA7CACCCCGCCGGCC7CGATCGAGCCA7AGCC7GTCGTCA 

7321 + T + . 7380 

GC7G7GGGACG7CG7CGGGC7AGTCGCGCCCGCGGAGC7AGCTCCG7ATCGGACAGCAG7 

GGCGGCGCGAATGG7CGGGAAACGCGACC7CC7GATCCG7CCAT7CGCCGCGACGAATGG 

7 381 * * -r r 7440 

CCGCCGCGCT7ACCACCCCTTTGCCCTGGAGGACTAGGCAGGTAAGCGGCGCTGCTTACC 

GCGACAGGCCCGCCAGCCATTCGGGCGAAAGATCCGTGTCGTGGCAGGACCAGGTGTGCT 

CGC7C7CCGCGCGG7CGC7AACCCCGC777C7AGGCACAGCACCG7CCTGGTCCACACGA 

GGTCCGAGGGGCCGGACCGCGCGTCGAGCATCACGATGCGCGCATCCGGTCTGCGGTCGC 

7501 -f * * r 75S0 

CCACGC7CCCCGGCCTGGCGCGCAGC7CG7AC7GCTACCCGCG7AGCCCAGACGCCAGCG 

GAACGCGAAGCGCGATCAGCGCACCGCACAGCCCCGCGCCCGCGATCAGCAGATCA7GGC 

7551 + * + ► 7520 

CTTGCCGTTCGCGCTAGTCGCGTGGCCTGTCGGGGCGCGGGCGCTAGTCGTCTAGTACCG 

TCATGTATTGCGATCCGCCCCTTCCCGGTCGTTCAGCACCGCGCCCGAGCGTTTCAGCTC 

7521 * » + + 7680 

AGTACATAACGCTAGGCGGGGAAGCGCGAGGAAGTCGTCGCGCGGGCTCGCAAAGTCGAG 

TGCCTTGAGGC7GTCGACCGAGGGCGCCCAGATGAAACCGAAGCTGACGCAGTTCTCGCG 

7 531, * * , 7740 

ACGGAACTCCGACAGCTGCC7CCCGCSGGTC7ACTTTGGC7TCGACTGCGTCAAGAGCGC 

GCCATGGACCGCG7GATGCATCC7GTG7GCC7GGTAGACGCGACGAAGA7AGCCGCCC77 

7 74 1 - * + * 7300 

CGG7ACC7GGCGCAC7ACG7AGGACACACGGACCA7C7GCGC7GG77C7ATCGGCGCGAA 

GGGGACATAGCGGAACCGGCAGCCCCCA7CCACCAAGCCG7CATGCAGGAAA7AGTAGA? 

7301 + + 7360 

CCCC7G7A7CCCC7TCCCGC7CCCCGG7ACS7GCT7CGGCAG7ACC7CCTTTATCATCTA 

CACCCCC7AGCAGG7GACCCCCACCGCCAGCCACCAGGCCAGA7CCGACCCCATCGCGCG 

7 3 51 + t 7 920 

G7CGCGCA7CG7CCAC7GGGGG7GGCGGTCGG7GGTCCGG7C7AGGC:GGGGTAGCGCGG 

GA7CGCGAACAGCACGATCGAGAT7ACCGGGAAGATGACGCCATAGAGGTCGTTCTTCTC 

7921 * * ^ + + 79 80 

C7AGCGCTTG7CGTGCTAGCTC7AA7GCCGCT7C7AC7GCGGTATCTCCAGCAAGAAGAG 
Fig. 24/9 

GAGCGCCTCCTCGTCATCCrCGTCGTGCTGCGATTTATGCCAGCCCCAGCCCAGGGGGCC 

7981 ■ + ~t— 3040 

CTCGCGCACCAGCACTAGGACCAGCACCACC^TAAATACGGTCGGGGTCGGGTCCCCCGG 

ATGCATGATCCACCGATGCACCGAG7ACGCCG7CAGCTCCATCCCGGCGACGGTCAGGAT 

8041 * -i- ->■ -t- + aioo 

TACGTACTAGGTCGCTACCrrCCrrCATCCGGCAGTCGAGGTAGCGCCSCTGCCAGTCCrA 

GACGGTCAGGA7TGCGGCCCAAG7GC7CA7CCCGGCCCC7TGC77GA7A7GACAGGGAAC 

8101 ' =• + + -r + i. 8150 

crGCCAG7CC7AACGCCGGG77CACGAG7ACGGCCGGGGAACGAAC7ATACTGTCCCTTG 

AGGCTACSC7GCCGCGCGG7GCA7GACCAGCCCATCGGGG7GCGACCAAAGGGCATCGCG 

81S1 ♦ + + ► h 8220 

7CCGA7GCGACGGCGCGCCACG7AC7GG7CGGC7AGCCCCACGCTGGTTTCCCGTAGCGC 

TGACATCTGCG7TCAGGGCTCATAGGCGGATCATCCGTGACATTCGCCCCCGAACGCGGC 

8221 - + + -h ► 8280 

ACTGTAGACGCAAG7CCCGAC7A7CCGCC7AG7AGCCACTGTAAGCGGCGGCTTGCGCCG 

AGGCGCATCACCCGTTCCGTCGCTGGAAATATTAATGTTTTCCCGAAGATGGTCGGGGCG 

8281 +• + c ► -h r 3340 

TCCGCGTAGTGCGCAAGGCAGCGACCTT7A7AATTACAAAAGGGCTTC7ACCAGCCCCGC 

AGAGGATTCGAACCTCCGACCTACGGTACCCAAAACCGTCGCGCTACCAGGCTGCGCTAC 

8341 + + + k ( c 8400 

TCTCCTAAGCT7GGAGGC7GGA7GCCA7GCG7777GGCAGCGCGA7GGTCCGACGCGATG 

GCCCCGAC7GCGGAAGGC777AGCCGAT7G77CCGGGAAGGGAAAGACCTAGTCGCAGGC 

8401 * + + + 1 ► 84S0 

CGGGGCTGACGCCTTCCGAAATCCGCTAACAAGGCCGTTCCCTTTCTGGATCAGCGTCCG 

CAGCACCGCAT7CTCCCCC*7GCCCGCA7GCCCCATCGCC7GACCGGCCTTCAGGCCAAG 

84S1 * K j. ► 8520 

GTCCTGGCGTAACAGCGGGTACGGGCCTACGCGGTAGCCGACTGGCCCGAAGTCCGGTTC 

GCCATCCGCC7C7CCGCCCGCGA7T7CGAGGACGAACAGCCGG7CGGGGTCCGGATCGCC 
8521 + + + 1 ► 8580 

GACCGCCGCGCCCGGAATGGGCGTCTCGTCCAGCGGGCGCGCATTGCGGTGGATGTGGCG 

8581 + + ► + 1. + 8S40 

CTGGCGGCGCGGGCCTTACCCGCAGAGCAGGTCGCCCGCCCGTAACGCCACCTACACCGC 

GATGACCCCCG7T7CA7CCGCAAAGACCATG7CCACCGGGATCAGTGTGTTGCGCATCCA 

8541 * + +. c ► 8700 

CTACTCCGGCCAAAGTAGGCGTTTCTGGTACAGGTCGCCCTAGTCACACAACGCGTAGGT 

GAAGGACACCGGCTGGGGCGATtCCTAGATCAACAGCATTCCGGTGCCCGCAGGCAGCTC 

8701 * -r + ». 1- 3760 

CTTCCTGTGGCCGACCCCGCTAAGCATCTACTTGTCGTAAGGCCACGGGCGTCCGTCGAG 

CTTGCGGAACATCAGGCCCTGtGCGCGCTCTTCGGGGCTGTCCGCGACCTCGACCCGAAA 

8751 *■ +. *. i- 8820 

GAACGCCTTGTAGTCCGGGACGCGCGCGAGAAGCCCCGACAGGCGCTGGAGCTGGGCTTT 

CCCGAGCGTTTCCGCACCr^GTATCGACGACAAGACTGCCGGGCGCGCATTCCACCGCCGC 

8821 + + + + + + 8880

GGGCTCGCAAAGGCGTCGCCATAGCTCCTGTTCTGACGGCCCGCGCGTAAGCTGGCGGCG

CGCGGCGCCGGGCATCAGCACCGCAACAAGCGCTGCGGCCTTACTCGGCCACATGGGCAA 

8881 + + + + + + 8940

GCGCCGCCGCCCGTAGTCCTCGCCTTCTTCGCCACGCCGGAATGAGCCGCTGTACCCGTT

GATAGGACTGCTCGGCGCCGAGATCCCCCGGGCTGCACGAATTCCATATCAAGCTTATCC

8941 + + + + + + 9000

CTATCCTGACCAGCCGCGGCTCTACCGCGCCCGACCTCCTTAAGCTATAGTTCGAATAGC
Fig. 24/10

ATACCOTCGACCrCGACOGGGGCCCC.ICTACCCAGCrrTtGTTCCCrrTAGTCAGGGTTA 
9001 + + + + + + 9060

TATGGCAGCTGCAGCTCCCCCCCGCGCCATGGGTCGAAAACAAGGGAAATCACTCCCAAT 

ATTGCGCGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTC 
9061 + + + + + + 9120

TAACGCGCGAACCGGATTACTACCAGTATCGACAAAGGACACACTTTAACAATAGGCGAG 

ACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGA

9121 + + + + + + 9180

TGTTAAGGTGTGTTGTATGCTCGGCCTTCGTATTTCACATTTCGGACCCCACGGATTACT 

GTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTG 

CACTCGATTGAGTGTAATTAACGCAACGCGAGTGACGGGCGAAAGGTCAGCCCTTTGGAC 

TCGTGCCACCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGG

AGCACGGTCGACCTAATTACTTACCCCGTTGCGCGCCCCTCTCCGCCAAACGCATAACCC 

CGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCG

9301 + + + + + + 9360

GCGAGAAGCCGAAGGACCGAGTGACTGAGCGACGCGACCCACCAAGCCGACGCCGCTCGC 

GTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGA

9361 + + + + + + 9420

CATAGTCGAGTGAGTTTCCGCCATTATGCCAATAGGTGTCTTAGTCCCCTATTGCGTCCT

AAGAACATGTGACCAAAAGGCCAGCAAAAGCCCACGAACCGTAAAAAGGCCGCGTTGCTG 

9421 + + + + + + 9480

TTCTTGTACACTCGTTTTCCGGTCGTTTTCCGGTCCTTGGCATTTTTCCGGCGCAACGAC

GCGTTTTTCCATAGGCTCCGCCGCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAG

9481 + + + + + + 9540

CGCAAAAAGGTATCCGAGGCGGGGGGACTGCTCGTAGTGTTTTTAGCTGCGAGTTCAGTC 

AGGTGGCGAAACCCGACACGACTATAAAGATACCAGGCGTTTCCCCCTGCAAGCTCCCTC

9541 + + + + + + 9600

TCCACCGCTTTGGGCTGTCCTGATATTTCTATGGTCCGCAAAGGGGGACCTTCGAGCGAG

GTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCG 
9601 + + + + + + 9660

CACGCGAGAGGACAACGCTCGGACGGCGAATGGCCTATGGACAGGCGGAAAGAGGGAAGC 

GGAAGCGTGCCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTT

9661 + + + + + + 9720

CCTTCGCACCGCGAAAGAGTATCCAGTGCCACATCCATAGACTCAAGCCACATCCAGCAA

CGCTCCAAGCTGCGCTGTGTGCACGAACCCCCCCTTCAGCCCGACCGCTGCGCCTTATCC

9721 + + + + + + 9780

GCGAGCTTCGACCCCACACACGTGCTTGGGGGGCAAGTCGGCCTGGCGACGCGGAATAGG

GGTAACTATCGTCTTGAGTCCAACCCGGTAACACACCACTTATCCCCACTCGCACCAGCC 

9781 + + + + + + 9840

CCATTGATAGCAGAACTCAGGTTGGGCCATTCTGTGCTGAATAGCGGTGACCGTCGTCGG

ACTGGTAACACCATTAGCAGAGCGAGGTATCTAGCCGGTGCTACAGAGTTCTTGAAGTGG

9841 + + + + + + 9900

TGACCATTGTCCTAATCGTCTCCCTCCATACATCCCCCACGATGTCTCAAGAACTTCACC

TGGCCTAACTACGGCTACACTACAAGGACACTATTTCGTATCTGCCCTCTGCTGAAGCCA

9901 + + + + + + 9960

ACCGGATTGATGCCGATGTGATCTTCCTGTCATAAACCATAGACGCCAGACGACTTCGGT
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CAATGGAAcxrcTrrrrcTCAAccATCGA^ ! 



l0O2 1 fflfiil^ilifinff^ff^^^ 

CCACCAAAJUAACAAACCTTCCTCGTCT^ 10080 
1008! ^Ittl^llll^^^ 

io! u l^fl^l^n*!:^:^^ 

AACCAGTACTCTAATAGTTTTTCCTAGAAGTGGATCTAGGAAAA^TAAT^TTACTTCA l ° 2 °° 
AAATTTAGTTAGATTTCATATATACTCATTTGAACCAG^TGTCAATGGTTACGAATTAG 10250 

TCACTCCGTGGATAGAGTCGCTAGACAGA7AAAGCAAGTAGGTATCAACGGACTGAGGGG 1 " " 

10321 CTCCTGTA ^ ?AAC7AC3ATACGCGAGGCCTTACCATCTgGCCCCACT 

CAGCACATCTATTGATGCTATGCCCTCCCGAATGGTAGACCTCGGTCAC^AC 1 " " 

CCGCGAGACCCACCCTCACCGGCTCCAGAtTTATCAGCAATAAACCAGCCAGCCGGAAGG 
10381 + + + + h — t- 10440 

GGCGCTCTGGGTGCGAGTCGCCGAGGTCTAAATAGTCGTTATTTGGTCGGTCGCCCTrCC 

,„..,, GCCGAGCGOGAACTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAArTGTTGC 
10441 + T ^ + 1 ^ l0S0Q 

C«XTCGCGTCrTCACCAGGACGTTGAA^TAGGCGGAGGTAGGTCAGATAArrAACAACG 
, „ c„ , CGGGAACCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACCTTGTTGCCATTGCT 

iosoi + * <. + „ ^ lO360 

GCCCTTCGATCTCATTCATCAACCGGTCAATTATCAAACGCGTTGCAACAACGCTAACGA 
, „ « . , ACAG ^ ATCGTGGTGTCACGCTCGTC ~^ 

10561 + * + + ( y 10S2Q 

TGTCCGTAGCACCACAGTGCGAGCAGCAAACCATACCGAAGTAAGTCGAGGCQUVGGCTT 

10«i ff A ff AACGCGAGTTAWTGATCCCCCATG ^^KAAAAAAGCC^ 

2 " * * + + + + 10S80 

GCTAGTTCCGCTCAATGTACTAGGGGGTACAACACGTTTTTTCGCCAATCGAGGAAGCCA 

, . , CCrCCGATCGTTGTCAGAAGT>_nGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCA 
10631 + + + ^ h ^ l074Q 

GGAGGCTAGCAACAGTCTTCATTCAACCCGCGTCACAATAGTGAGTACCAAXACCGTCGT 

CTGCATAATTCTCTTACTGTCA.TGCCATCCGT.tAGATGCTTTTCTGTGACTGGTGAGTAC 
10741 - _ ^ • 103Q0 

GACGTATTAAjAGAATGACAGTACGwTAGGCATTCTACGAAAAGACACTGACCACTCATG 

, na „, tcaaccaactca ^^c:ag^-agtgtatgcggcgaccgagttgctcttgcccggcgtca 

10801 * «. * + ^ L03S0 

AGTTGGTTCAGTAACACTCTTATCACATACGCCGCTGGCTCAACGAGAACGGGCCGCACT 

ATACGCGATAATACCGCCCCACATAGCAGAACTTTAAAACTGCTCATCATTGCAAAACGT 
10861 * _ t ^ + + 1Q920 

TATGCCCTATTATGGCGCGGTGTATCGTCTTGAAATTTTCACGAGTACTAACCTTTTGCA 
t09 2 1 I! I!!!^?^?-?^!:!;;?^?^:: TTACCGCT 5Tf GACATCCAGTTCGATGTAACCC 

agaagcccggcttttgagagttccta^aatgccgacaactctaggtcaIcctIcIttgg^ 10980 
Fig. 24/12

ACTCGTGCACCCAACTGATCTTCACCATCTTTTACTTTCACCAGCGTTTCTGGGTGACCA

10981 + + + + + + 11040

TGAGCACGTGGGTTCACTAGAAGTCGTAGAAAATGAAAGTGGTCGGAAAGACCCACTCGT

AAAACAGGAAGGCAAAATGCCCCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATA

11041 + + + + + + 11100

TTTTGTCGTTCCGTTTTACGGCGTTTTTTCCCTTATTCCCGCTGTGCCTTTACAACTTAT

CTCATACTCTTCCTTTTTCAATATTATTCAAGCATTTATCAGGGTTATTGTCTCATGAGC

GAGTATGAGAACGAAAAAGTTATAATAACTTCGTAAATAGTCCCAATAACAGAGTACTCG

GGATACATATTTGAATGTATTTACAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCC

11161 + + + + + + 11220

CCTATGTATAAACTTACATAAATCTTTTTATTTGTTTATCCCCAAGGCGCGTGTAAAGCG

CGAAAAGTGCCAC 
11221 + 11233

GCTTTTCACCGTG
Fig. 26
Fig. 29
Fig. 30/1



ACTCTACTCTGCC^GGATCGCCGOTCCGGGGGACA^ 

TCACATCAGACa^CTAGCGGCCAGGCCCCCTCTTCTATACTCGCGT^ 

AAGGCAGATCTC^CCGCCACCAGTTTGATC^^ 



ttccgtcragactggcggtggtcaaactagcagagcccgccgtagtagcggcgcaccgac 

gccctgcatgtccatcx:gctgtggtttct(^cgcgccggccx^ 

cgggacgtacacgtacgcgacaccaaagacctgcgccgccgcgtagggtaggaccgccag 

gcgaatttcctggggctgacctggctgtcggtcggtctgttcatcatc 

cgcttaaagc^ccccgactggaccgacagccagccagacaagtagtagcgcgtactgcgc 



TACCTACCCAGCCAGCACCSGCCCCGCGGGCGCGCGCnTrACKCGCTACCCGGTCGAACAG 
CTGTGGCTCTATGCCGGATTTTCCTC^^ 

GAC^CCGACATACGGCCTAAAAGGACCGCGTTCTACTAGCAGITCGTGTACCGGGTAGTA 
CGCCATC^GOVACCGACGACGACCCAGA^ 



^AAAGCTGGTACCGCCCSGGCCAGGCGACCATG 
(XXOXTICATaK^CCTATTrCGC^^ 

ACGGTCTATGCGCIGATCrrTGaJGGATCGCTGX^ 
TGCa^TACGCCaCTACAACCC^ 

TcxiATCcnGK(7rcc»TCCACxriCTrccrr(7TO 



AGCTAQGACCGCACCTAGGTCGACAAGCACAAGCCGTAGACCGACGGCGTGGCGGGGCCG 

cacgacgcgttgccggaccgccacaatc 



GTGCIGCGCAAOXXCTGGCGCtGTTACGCGCCAG^ 

CTGCTGACCTGCTTTCACTTTGGCGGTTATCATCA^ 

GACGACTCXIACGAAAGTGAAACCGCCAATAGTAGTGCT^ 

CCTTGGTGGCGCCTGCCCACrACCCGCACCAAGGGGGACACCG^ 

GGAACCACCGCGGACGGGTCCTrGGGCGTGCrn^CCCCTGTXXX 

TCGTCGTCGCCACCGTCXrrGGTGATGGAGCTGACGGC^ 

AGCAGC^GCGGTOXACGACCACTACCTCGACTGCCGCATAAGGCAGGTGGCGACCTAGT 



Fig. 30/2 



841 + + + + + + 900

TCGAAAAC»ACGAOTCTACC^^ 

901 + + + + + + 960

ACCTTTTCrTGCTGGACATGCCGGACCAGAAACGCCA 

TGGGCTGGATCTGGGCACCGCjrCCTGTGGTGGATCGCCTTGG^^ 

961 + + + + + + 1020

ACCCGACCTAGACCCGTGGCCAGGACACCACCTAGCGGAACCCGTACTGGCAGATGCCCG 

1021 + + + + + + 1080

ACTAGATAAAGCAGGACGTACTGCCCGACCACGTAGTCGCGACCGGGAAGGCGATATAGG 

CTCGCAAGGGCTATGCCAGACGCCTCTATCAGGCC(^CCGCCTGCACCAC 

1081 + + + + * + 1140 

GAGCGTTCCCGATACGGTCTGCGGACATAGTCCGGGTGGCGGACGTGOTGCCKCAGCTCC 

GGCGCGAOCATTGCGTCAGCTTCGGCTTCATCTATGCGCCGC CGGTCGACAAGCTGAAGC 

1141 1- * + * + + 1200 

CCGCCXTTGCTAACGCAGTCGAAGCCGAACnVVGAT^ 

AGGACCTQAAGACGTCGGGCGTGCTGCGGGCCCIAGGCGCAGGAGCGCACGTGACCC^ 

1201 * + ♦ f + + 1260 

TCCTGGACrn:ZXKAGCCCGCACGACGGCCGGCTCCGCGTCCTCGGGTGCACTGGGTACT 

C 

1261 1261
G 
Fig. 31



ATGAGCGCACATGCCCTGCCCAAGGC^^ 
1 +. + + + + + 60 

GGCATC^TCGCCGCGTGGCTGGCCCTGCATGTGCAT^ 
61 + +■ + + + + 120 

121 4. + + + + + 180 

cgcgtagggtaggaccgccagc&tttaaaggaccccc^^ 
ttcatcatcgcgcatgacgcgatgcatgggtcggtcgtgcc^ 

181 + + + + + * 240 

241 «. + + +. * + 300 

CGCCGCTACCCOTO^CACX^^ 

GTCAACX^CATGGCCCATOVrCGC^ 
301 + + + «• + + 360 

GGCGGCCCSSTCCGCTGGTACGCCCGCTTCAiCGGCACGTATTTCGGCTGGCGCGAGG^ 
361 + + + +• + + 420 

CTGCTGCTCXXCGTCATCGTGACGGTCTATG^ 

421 + + + + + + 480

GACC3ACGACCXX»^AG<^CTGCaVGATACGCGACrA(^ 

481 -t- * '- -t- -t- + + 540 

CACCAGAAQVCCGGCAACGCXZAGCTAGGACCGCAGCTAGGTCGACAAGCACAAGCCGTAG 

541 + + + + +— + 600 

ACCGACGGCCT^GGGGCaXryX^ 

CGGATC^GCGACCCCCTtCTCGCTCCrGACCnXr^ 

GCCTAGTCGCTGGGGCACAO:^^ 

CACCACCTOlACCCC^CCXTrGCCTTCGTC^ 

661 * + + + + + 720 

GTGGT5GACGTCGGCTGCCACGGAACCACCGCGGACGGCTCGTGGGCCT 

ACCGCATGA 

721 729 

TGGCGTACT 
Fig. 32



1 MSAHALPKAD LTATSLIVSG GIIAAWLALH VHALWFLDAA AHPILAVANF

51 LGLTWLSVGL FIIAHDAMHG SVVPGRPRAN AAMGQLVLWL YAGFSWRKMI

101 VKHMAHHRHA GTDDDPDFDH GGPVRWYARF IGTYFGWREG LLLPVIVTVY

151 ALILGDRWMY VVFWPLPSIL ASIQLFVFGI WLPHRPGHDA FPDRHNARSS

201 RISDPVSLLT CFHFGGYHHE HHLHPTVPWW RLPSTRTKGD TA* 
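
The residues listed in this figure appear to follow from a codon-by-codon reading of the nucleotide sequence of Fig. 31 with the standard genetic code. The following Python sketch illustrates such a reading on the first 24 and the last 9 legible bases of Fig. 31. The function name `translate` and the use of `X` for unreadable codons are illustrative assumptions and are not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Codons containing characters other than A, C, G, T (for example
    extraction damage) are reported as 'X' rather than guessed.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# First 24 bases of the upper strand of Fig. 31 (positions 1 to 24).
print(translate("ATGAGCGCACATGCCCTGCCCAAG"))  # MSAHALPK
# Last 9 bases of the upper strand of Fig. 31 (positions 721 to 729).
print(translate("ACCGCATGA"))  # TA*
```

Both results agree with the start (MSAHALPK) and the end (TA*) of the amino acid sequence shown in Fig. 32.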
Fig. 33



ATGACCAATTrCCTGATCGTCGTCO:^ 

1 + + + + + +- 60 

TACTGGTTAAAGGACTAC^GCAGCCM^ 

CTCCACCGCTCXIATCATCXIACGGCCCCTTG^ 

61 ♦ + + + + * 120 

CAGGTGGCGACCTACTACOTSCCC*3G<^ 

GAACACGACCACGCC3CTGGAAAAGAACGACCTGTACGGCCTG<rrCTTT^GGTGATCGCC 

121 + + + + + *■ 180 

CTTCTGCTGCTCCCKGACCTrTT^^ 

181 + + + + * + 240 

TGCCACGACAACTI^CACCCGACCTAGACCCGTGGCCAGGACACCACCTAGCGGAACCCG 

ATGACCGTCTACGGGCTGATCTATTrCGTCCTGCATGACGGGCTG<nt3CATCA<^GCTG3 

241 +• + -t- + + + 300 

TACIGGCAGATGCCCGACTAGATAAAGCAGGACGTACTGCCCGACCACGTAGTCGCGACC 

CCGTTCC<XrTATATCCCTCGCAAGG<XrTATGCCAGACGCCTGTATCAGGCCCACC 

301 + + + + + + 360 

OGCAA«XGATATAGGGA<K<7riCCCGATAC»3TCKCGGACATAGTC 

CACCACGCGGTCGACKXX3CGCGACCATTGCGTCAGCTTCGGCTTCATCTATGCGCCGCCG 
361 * + + + ♦ + 420 

gtggtgcgccagciccccgcgctggtaacgcagtcgaagccgaagtagatacgcggcggc 
gtcgacaagctgaaggaggacctgaagacgtcgggcgtoc^^ 

421 * * + + +• + 480 

CGCACG 

481 486

GCGTCC 
Fig. 34



1 MTNFLIVVAT VLVMELTAYS VHRWIMHGPL GWGWHXSHHE EHDHALEKND

51 LYGLVFAVIA TVLFTVGWIW APVLWWIALG MTVYGLIYFV LHDGLVHQRW 

101 PFRYIPRKGY ARRLYQAHRL HHAVEGRDHC VSFGFIYAPP VDKLKQDLKT

151 SGVLRAEAQE RT 
Fig. 35
Fig. 36
Fig. 37
Fig. 38/1



CTGCAGGTCTGACACGGCCAGAAGGCCCCGCCGCCGGcCGGCGGCCGCcGCAreCCGACC 
1 + + + + + + 60 

GACGTCCAGACTGTGCCGGTCTTCCGGCGCGGCGCCCgGCCCCCGGCGgCGTAGCGCTGG 

GGTATCCTTGCCAAGCGCCGCCTGGTCGCCCACaACGTCCAGCAGGTCGTCArAGGACTG 

61 + + ♦ + + + 120 

CCATAGGAACGGTTCGCGGCGGACCAGCGGGTGCTGCAGGTCGTCCAGCAGTATCCTGAC 

GAACACCCGGCCCAGCTGACGGCCAAAGTCGATCATCTGaGTCTGCTCCTCGGCGTCGAA 

121 + + + + + + 180

CTTGTGGCCCGGGTCGACTGCCGGTTTCAGCTAGTAGACtCAGACGAGGAGCCGCAGCTT 

CTCCTTGATCACGGCCAGCATCTCCAGCCCGGCGATGAACAGCACGCCGGTCrrCAGGTC 

181 * + + + + + 240 

GAGGAACtAGTGCCGGTCGTAGAGGTCGGGCCGCTACTTGTCGTGCGGCCAGAAGTCCAG 

241 + + ♦ + + + 300 

GACAAGGACAAGCTGGGGGCGCGGCAACAACCGGCGCACGTCCAGGTCCAGGACCGGCCG 

GCACAGGCCCTGCGGCCCCAGGGACCGCGACAGGATCCgcaecagcCgegcccgcacegt 

301 + + + + + + 360

CGTGTCCGGGACGCCGGGGTCCCrGGCGCTGTCCTAGGcgtggtcgacgcggrgcgcggca 

gcccgacgcgccgcgcgcaccggccagcagggccatcgcctcggtgatcagggcgatgcc 

361 ♦ + + + + + 420 

cgggccgcgcggcgcgcgtggccggtcgtcccggtagcggagccaccagtcccgctacgg 

gcctagcacggcgcggctctcgccacgegccacatgggtcgcgggccggccgcggcgcag 

421 +• + + + + + 480 

cggategTgeegcgcegaaagcggtaegcggtgtacccagcgeccgaccggcgcegcgtc 

cccggcatcgtccatgcagggcaggccgtcgaagatcagcgatgcggcacgcaccatctc 

481 + *■ + + + + 540 

gggcegtagcaggeacgtcccgtccagcagctcctagtcgccacgccgtacgtggtagag 

gaccgcgcaggcggcgccgacgaccgcgtcgcagaccccgcccgaggctcctgccgcaag 

541 «. + 1 + + + 600 

ctggcgcgtccgccgcagctgctagcacagcgcctggggcgggctccgaagacggcgtcc 

cagcatcagcatgecgcggaaacgetegeccgacgacagcgcgccatggetcatggecgg 

601 -f ♦ ♦ +— r + + 660 

gt ogc agt cgt acggcgcct t Cgcgaacgggctgctgc cgcgcggt accgagcaccggcc 

gccgagcggctgcgacacggcaccgaatccctgggcgatctcctcaagtetggtctgcag 

661 + v h + + + 720 

cggctcgccgacgctgcgccgcggcttagggacccgctagaggagttcagaccagacgcc 

aagggtggcgtggatcgggttgacgcctcgtctcatcagtgccttcgcgcttgggtcctg 

721 + + + + + + 780 

ttcccaccgcacctagcccaactgcagagcagagtagtcacggaagcgcgaacccaagac 

accaggcgggaaggtcaggccggggcggcaccccgtgacccgtcatccaccgtcaacagt 

tggtccgcccttccagtccggccccgccgtggggcactgggcagtaggtggcagttgtca 

ccccacgttggaaggcttcacgcccgattgcgagccttttcgacggcgacgcggggtcgc 

841 + +• ♦ + + + 900 

ggggtacaaccttccgaagtgcgggctaacgctcggaaaagctgccgctgcgccecagcg 

gcggcaatccntccaaeaaggecaqcggaccggcgcgcegaeggccgcgcgcagccaggc 

901 -i- * + ♦ * 960 

cgccgttaaanaggctgctccagecacctggecgcgcggctaccggcgcgcgtcggcccg 

acccttggccggaaacacccgcgccgcatcacgaccggccaggaccgtccggcgcgcggc 
961 + + + + + + 1020
Fig. 38/2

Caqgaaccggcccccgcgggcgcggcgcagtactagccggtcctagcagqccgcgcgccg 

gcggcgcaggtcggccgcgtcacccggattgtcaagcacccaggccatcgcgtccgcgac 

1021 -• + + + + + + 1080 

cgccgcgtccaqccggcgcagtgqqcctaacagttcgtgggtccggtagcgcaggcgctg 

cccgtecgcgtcgtccatgtcgacgaccaggccgctctccacgccgcggaccagctcgcg 

1081 + + + + + + 1140

gagcaggcgcagcaggtacagccgctagtccggcaagaggtacagcgcctggccaagcgc 

caecggggcggtgttcgatcgaccaccaggcacccggtggccaccgccccggacagggac 

1141 * ♦ ♦ + + 1200 

gcggccccgccacaagccagccagcggc ccgc aggccaccggc agcggagcccgc ccctg 

caggaggtgacgaagggctcggtgaaacagacatgcgcgtgcgaggcctgcag 

1201 ♦ * + + 1253 

gtcctccactgcttcccgagccactttatccgtacgcgcacgctccggacgtc 
Fig. 39

ATGAGACGAGACGTCAACCCGATCCACGCCACCCTTCTGCAGACCAGACTTGAGGAGATC

1 + + + + + + 60

TACTCTGCTCTGCAGTTGGGCTAGGTGCGGTGGGAAGACGTCTGGTCTGAACTCCTCTAG 

GCCCAGGGATTCGGTGCCGTCTCGCAGCCGCTCGGCCCGGCCATGAGCCATGGCGCGCTG 

61 + + + + + + 120

CGGGTCCCTAAGCCACGGCAGAGCGTCGGCGAGCCGGGCCGGTACTCGGTACCGCGCGAC

TCGTCGGGCAAGCGTTTCCGCGGCATGCTGATGCTGCTTGCGGCAGAAGCCTCGGGCGGG 

121 + + + + + + 180 

AGCAGCCCGTTCGCAAAGGCGCCGTACGACTACGACGAACGCCGTCTTCGGAGCCCGCCC 

GTCTGCGACACGATCGTCGACGCCGCCTGCGCGGTCGAGATGGTGCATGCCGCATCGCTG

181 + + * + * ♦ 240 

CAGACGCTGTGCTAGCACCTGCGGCGGACGCGCCAGCTCTACCACGTACGGCGTAGCGAC 

ATCTTCGACGACCTGCCCTGCATGGACGATGCCGGGCTGCGCCGCGGCCAGCCCGCGACC

241 '■ ♦ + + ' ♦ * + 300 

TAGAAGCTGCTGGACGGGACGTACCTGCTACGGCCCGACGCGGCGCCGGTCGGGCGCTGG 

CATGTGGCGCATGGCGAAAGCCGCGCCGTGCTAGGCGGCATCGCCCTGATCACCGAGGCG

301 + + + + + + 360

GTACACCGCGTACCGCTTTCGGCGCGGCACGATCCGCCGTAGCGGGACTAGTGGCTCCGC

ATGGCCCTGCTGGCCGGTGCGCGCGGCGCGTCGGGCACGGTGCGGGCGCAGCTGGTGCGG 

361 + + + + + + 420

TACCGGGACGACCGGCCACGCGCGCCGCGCAGCCCGTGCCACGCCCGCGTCGACCACGCC 

ATCCTGTCGCGGTCCCTGGGGCCGCAGGGCCTGTGCGCCGGCCAGGACCTGGACCTGCAC 

421 + + + + + + 480

TAGGACAGCGCCAGGGACCCCGGCGTCCCGGACACGCGGCCGGTCCTGGACCTGGACGTG 

GCGGCCAAGAACGGCGCGGGCGTCGAACAGGAACAGGACCTGAAGACCGGCGTCCTGTTC

481 * * * ♦ + * 540 

CGCCGGTTCTTGCCCCGCCCCCAGCTTGTCCTTGTCCTGGACTTCTGGCCGCACGACAAG 

ATCGCCGGGCTGGAGATGCTGGCCGTGATCAAGGAGTTCGACGCCGAGGAGCAGACTCAG

541 + + + + + + 600

TAGCGGCCCGACCTCTACGACCGGCACTAGTTCCTCAAGCTGCGGCTCCTCGTCTGAGTC 

ATGATCGACTTTGGCCGTCAGCTGGGCCGGGTGTTCCAGTCCTATGACGACCTGCTGGAC 

601 + + * * * * 660 

TACTAGCTGAAACCGGCAGTCGACCCCGCCCACAAGGTCACGATACTGCTGGACGACCTG 

GTTGTGGGCGACCAGGCGGCGCTTGGCAAGGATACCGGTCGCGATGCGGCGGCCCCCGGC

661 ♦ ♦ * + «■ + 720 

CAACACCCGCTGGTCCGCCGCGAACCGTTCCTATGGCCAGCGCTACGCCGCCGGCGGCCG 

CCGCGGCGCGGCCTTCTGGCCGTGTCAGACCTGCAGAACGTGTCCCGTCACTATGAGGCC

721 ♦ * + * * + 780 

GGCGCCGCGCCGGAACACCGGCACAGTCTGGACGTCTTGCACAGGGCAGTGATACTCCGG 

AGCCGCGCCCAGCTGGACGCGATGCTGCGCAGCAAGCGCCTTCAGGCTCCGGAAATCGCG 

781 + + -t- + *■ + 840 

TCGGCGCGGGTCCACCTGCGCTACGACGCGTCGTTCGCGGAAGTCCGAGGCCTTTAGCGC 

GCCCTGCTGGAACGGGTTCTGCCCTACGCCGCGCGCGCCTAG

841 + + + + 882

CGGGACGACCTTGCCCAAGACGGGATGCGGCGCGCGCGGATC 
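
In the sequence figures the upper strand is printed above its base-by-base complement, so that the two printed strands can be compared against each other. The following Python sketch illustrates such a comparison on the first 20 base pairs of Fig. 39. The helper names `complement` and `strand_mismatches` are hypothetical and serve only as an illustration.

```python
# Watson-Crick pairing for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base-by-base complement, in the same left-to-right order."""
    return strand.upper().translate(COMPLEMENT)


def strand_mismatches(upper, lower):
    """Return the 1-based positions at which `lower` does not pair with `upper`."""
    expected = complement(upper)
    return [i + 1 for i, (e, b) in enumerate(zip(expected, lower.upper())) if e != b]


# First 20 base pairs of Fig. 39 (positions 1 to 20).
upper = "ATGAGACGAGACGTCAACCC"
lower = "TACTCTGCTCTGCAGTTGGG"
print(complement(upper) == lower)       # True
print(strand_mismatches(upper, lower))  # []
```

A non-empty result identifies positions at which one of the two printed strands is likely misprinted; the comparison alone does not indicate which strand carries the error.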
Fig. 40

1 MRRDVNPIHA TLLQTRLEEI AQGFGAVSQP LGPAMSHGAL SSGKRFRGML

51 MLLAAEASGG VCDTIVDAAC AVEMVHAASL IFDDLPCMDD AGLRRGQPAT

101 HVAHGESRAV LGGIALITEA MALLAGARGA SGTVRAQLVR ILSRSLGPQG

151 LCAGQDLDLH AAKNGAGVEQ EQDLKTGVLF IAGLEMLAVI KEFDAEEQTQ

201 MIDFGRQLGR VFQSYDDLLD VVGDQAALGK DTGRDAAAPG PRRGLLAVSD

251 LQNVSRHYEA SRAQLDAMLR SKRLQAPEIA ALLERVLPYA ARA*
Fig. 42